Load FE23¶
Load TRW (tree-ring width) data from FE23 (https://doi.org/10.25921/8hpf-a451)
Dataset downloaded from NCEI: https://www.ncei.noaa.gov/access/paleo-search/study/36773
Created 25/10/2024 by Lucie Luecke (LL)
Updated 24/10/2025 by LL: tidied up and streamlined for documentation and publication
Updated 21/11/2024 by LL: added csv saving of compact dataframe, removed redundant output.
Here we extract a dataframe with the following columns:
archiveType, dataSetName, datasetId, geo_meanElev, geo_meanLat, geo_meanLon, geo_siteName, interpretation_direction (new in v2.0), interpretation_variable, interpretation_variableDetail, interpretation_seasonality (new in v2.0), originalDataURL, originalDatabase, paleoData_notes, paleoData_proxy, paleoData_sensorSpecies, paleoData_units, paleoData_values, paleoData_variableName, year, yearUnits
We save a standardised compact dataframe for concatenation to DoD2k
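Because every source database is reduced to the same standardised columns, the compact dataframes can later be stacked row-wise. A minimal, hypothetical sketch of that concatenation step (the column subset and record values here are made up for illustration):

```python
import pandas as pd

# Hypothetical sketch: compact dataframes from different source databases
# share the same standardised columns, so they can be stacked row-wise.
df_fe23_c = pd.DataFrame({'datasetId': ['FE23_africa_keny001'], 'archiveType': ['Wood']})
df_other = pd.DataFrame({'datasetId': ['OTHER_record001'], 'archiveType': ['Coral']})

# concatenate and re-index; the database token in datasetId keeps IDs unique
dod2k = pd.concat([df_fe23_c, df_other], ignore_index=True)
print(len(dod2k))  # -> 2
```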
Set up working environment¶
Make sure the repo_root is set correctly; it should be `your_root_dir/dod2k`. This should be the working directory throughout this notebook (and all other notebooks).
%load_ext autoreload
%autoreload 2
import sys
import os
from pathlib import Path
# Add parent directory to path (works from any notebook in notebooks/)
# the repo_root should be the parent directory of the notebooks folder
init_dir = Path().resolve()
print(init_dir)
# Determine repo root
if init_dir.name == 'dod2k':
    repo_root = init_dir
elif init_dir.parent.name == 'dod2k':
    repo_root = init_dir.parent
elif init_dir.parent.parent.name == 'dod2k':
    repo_root = init_dir.parent.parent
else:
    raise Exception('Please review the repo root structure (see first cell).')
# Update cwd and path only if needed
if os.getcwd() != str(repo_root):
os.chdir(repo_root)
if str(repo_root) not in sys.path:
sys.path.insert(0, str(repo_root))
print(f"Repo root: {repo_root}")
if str(os.getcwd())==str(repo_root):
print(f"Working directory matches repo root. ")
/home/jupyter-lluecke/dod2k_v2.0/dod2k/notebooks Repo root: /home/jupyter-lluecke/dod2k_v2.0/dod2k Working directory matches repo root.
import xarray as xr
import pandas as pd
import numpy as np
from dod2k_utilities import ut_functions as utf # contains utility functions
from dod2k_utilities import ut_plot as uplt # contains plotting functions
load the source data¶
Specify the data and metadata which we are looking to extract from FE23 for the standardised 'compact dataframe':
vars = ['chronos', 'lonlat', 'investigator', 'trwsSm', 'chronology', 'country', 'species',
'elevation', 'sitename', 'treetime']
To obtain the source data, run the cell below (warning: it downloads the full 25 GB netCDF from NCEI) and extract a slice based on the relevant variables (~60 MB).
Alternatively, skip that cell and directly use the slice provided in this directory (see the cell after next).
# # download FE23
# !wget -O data/fe23/franke2022-fe23.nc https://www.ncei.noaa.gov/pub/data/paleo/contributions_by_author/franke2022/franke2022-fe23.nc
# fe23_full = xr.open_dataset('data/fe23/franke2022-fe23.nc')
# # save slice of FE23 with only the relevant variables as netCDF (fe23_full is 25GB)
# fe23_slice = fe23_full[vars]
# fe23_slice.to_netcdf('data/fe23/franke2022-fe23_slice.nc')
fe23_slice = xr.open_dataset('data/fe23/franke2022-fe23_slice.nc')
print(fe23_slice)
<xarray.Dataset> Size: 58MB
Dimensions: (ttime: 1159, nseries: 278, nregion: 22, lonlat: 2,
nchars_cinv: 42, nchars_chr: 32, nchars_ctry: 22,
nchars_csp: 6, nchars_cn: 51)
Coordinates:
lonlat (nseries, nregion, lonlat) float64 98kB ...
Dimensions without coordinates: ttime, nseries, nregion, nchars_cinv,
nchars_chr, nchars_ctry, nchars_csp, nchars_cn
Data variables:
chronos (ttime, nseries, nregion) float64 57MB ...
investigator (nchars_cinv, nseries, nregion) |S1 257kB ...
trwsSm (nseries, nregion) float64 49kB ...
chronology (nchars_chr, nseries, nregion) |S1 196kB ...
country (nchars_ctry, nseries, nregion) |S1 135kB ...
species (nchars_csp, nseries, nregion) |S1 37kB ...
elevation (nseries, nregion) float64 49kB ...
sitename (nchars_cn, nseries, nregion) |S1 312kB ...
treetime (ttime) float64 9kB ...
Attributes:
reference: Franke, J; Evans, MN; Schurer, AP; Hegerl, GC, 2022, Clim...
doi: https://doi.org/10.25921/8hpf-a451
creation_time: 27-Oct-2024 11:45:29
df_fe23 = {}
for var in vars:
    print(var)
    df_fe23[var] = []
    fe23_slice[var] = np.squeeze(fe23_slice[var])
    for ii in fe23_slice.nregion:  # loop through the regions
        for jj in fe23_slice.nseries:  # loop through the records in any one region
            if var in ['chronos']:
                data = fe23_slice[var][:, jj, ii].data
            elif var in ['trwsSm', 'elevation']:
                data = float(fe23_slice[var][jj, ii].data)
            elif var in ['lonlat']:
                data = fe23_slice[var][jj, ii, :].data
            elif var in ['investigator', 'chronology', 'country', 'species', 'sitename']:
                # strings are stored as |S1 char arrays: join the bytes and decode
                data = b''.join(fe23_slice[var][:, jj, ii].data).decode("latin-1").replace(' ', '')
            # keep only series that contain any data
            if not np.all(np.isnan(fe23_slice['chronos'][:, jj, ii].data)):
                df_fe23[var].append(data)
chronos lonlat investigator trwsSm chronology country species elevation sitename treetime
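The string-valued variables (investigator, chronology, country, species, sitename) are stored in the netCDF as arrays of single bytes (dtype `|S1`), padded with spaces. A minimal sketch of the join-and-decode step used in the loop above (the character values here are hypothetical):

```python
import numpy as np

# A |S1 char array as stored in the netCDF, padded with trailing spaces
chars = np.array([b'k', b'e', b'n', b'y', b'0', b'0', b'1', b' ', b' '], dtype='S1')

# join the single-byte elements, decode, and strip the space padding
name = b''.join(chars).decode('latin-1').replace(' ', '')
print(name)  # -> keny001
```

Note that `.replace(' ', '')` also removes interior spaces, which is why site names like 'RagatiForestStationNyeriDistrict' appear concatenated.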
create compact dataframe¶
Create an empty dataframe and populate it with the data from the netCDF.
df_compact = pd.DataFrame(columns=['archiveType', 'interpretation_variable', 'dataSetName', 'datasetId',
'geo_meanElev', 'geo_meanLat', 'geo_meanLon', 'geo_siteName',
'originalDatabase', 'originalDataURL', 'paleoData_notes', 'paleoData_proxy',
'paleoData_units', 'paleoData_values', 'year', 'yearUnits'])
df_compact['paleoData_values'] = df_fe23['chronos']
df_compact['year'] = [fe23_slice.treetime.data for ii in range(len(df_compact))]
The netCDF has a homogeneous time coordinate but may contain missing values. We keep only non-NaN data:
for ii in df_compact.index:
dd=utf.convert_to_nparray(df_compact.at[ii, 'paleoData_values'])
df_compact.at[ii, 'paleoData_values']=dd.data[~dd.mask]
df_compact.at[ii, 'year']=np.array(df_compact.at[ii, 'year'])[~dd.mask]
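The cell above relies on the values being masked arrays: the mask flags invalid entries, so indexing with `~mask` keeps only valid data and the matching years. A minimal sketch with hypothetical values:

```python
import numpy as np

# homogeneous time axis and a series with missing values (hypothetical)
years = np.array([1900., 1901., 1902., 1903.])
vals = np.ma.masked_invalid(np.array([0.5, np.nan, 1.2, np.nan]))

# ~mask selects only the valid (non-NaN) entries, in both data and years
print(vals.data[~vals.mask])  # -> [0.5 1.2]
print(years[~vals.mask])      # -> [1900. 1902.]
```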
df_compact[['geo_meanLon', 'geo_meanLat']] = df_fe23['lonlat']
df_compact['geo_meanElev'] = df_fe23['elevation']
df_compact['datasetId'] = df_fe23['chronology']
df_compact['datasetId'] = df_compact['datasetId'].apply(lambda x: x.replace('.rwl',''))
df_compact['dataSetName'] = df_compact['datasetId']
df_compact['datasetId'] = df_compact['datasetId'].apply(lambda x: 'FE23_'+x)
Keep populating the metadata columns from the netCDF metadata.
The original data URL can be reconstructed from NCEI using the dataSetName.
url = 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/'
df_compact['geo_siteName'] = df_fe23['sitename']
df_compact['paleoData_sensorSpecies'] = df_fe23['species']
df_compact['paleoData_notes'] = df_fe23['investigator']
df_compact['paleoData_notes'] = df_compact['paleoData_notes'].apply(lambda x: 'Investigator: '+x)
df_compact['originalDataURL'] = df_compact['dataSetName'].apply(lambda x: url+x.replace('_','/')+'-noaa.rwl')
df_compact['archiveType'] = 'Wood' # fills column 'archiveType'
df_compact['paleoData_proxy'] = 'ring width' # fills column 'paleoData_proxy'
df_compact['paleoData_units'] = 'standardized_anomalies' # fills column 'paleoData_units'
df_compact['originalDatabase'] = 'FE23 (Breitenmoser et al. (2014))' # fills column 'originalDatabase'
df_compact['yearUnits'] = 'CE' # fills column 'yearUnits'
df_compact['paleoData_variableName'] = 'ring width' # fills column 'paleoData_variableName'
The climate interpretation variable in the netCDF is given as an integer (1: temperature sensitive, 2: moisture sensitive, 3: temperature and moisture sensitive, 4: neither temperature nor moisture sensitive).
TM = {1.:'temperature', 2.:'moisture', 3.:'temperature+moisture', 4.: 'NOT temperature NOT moisture', 0:'nan'}
df_compact['interpretation_variable'] = df_fe23['trwsSm']
df_compact['interpretation_variable'] = df_compact['interpretation_variable'].apply(lambda x: TM[x] if ~np.isnan(x) else 'N/A')
df_compact['interpretation_variableDetail'] = 'N/A'
df_compact['interpretation_seasonality'] = 'N/A'
df_compact['interpretation_direction'] = 'N/A'
Drop rows with no data, as well as all-zero, all-NaN, and constant rows.
drop_inds = []
for ii in range(df_compact.shape[0]):
if len(df_compact.iloc[ii]['year'])==0:
print('empty', ii, df_compact.iloc[ii]['year'], df_compact.iloc[ii]['originalDatabase'])
print(df_compact.iloc[ii]['paleoData_values'])
drop_inds += [df_compact.index[ii]]
for ii, row in enumerate(df_compact.paleoData_values):
if np.std(row)==0:
print(ii, 'std=0')
elif np.sum(np.diff(row)**2)==0:
print(ii, 'diff=0')
elif np.isnan(np.std(row)):
print(ii, 'std nan')
else:
continue
if df_compact.index[ii] not in drop_inds:
drop_inds += [df_compact.index[ii]]
print(drop_inds)
df_compact = df_compact.drop(index=drop_inds)
[]
Check that the datasetId is unique and that each record has an ID
# check that the datasetId is unique
assert len(df_compact.datasetId.unique())==len(df_compact)
save compact dataframe¶
save pickle¶
# save to a pickle file (security: is it better to save to csv?)
df_compact = df_compact[sorted(df_compact.columns)]
df_compact.to_pickle('data/fe23/fe23_compact.pkl')
save csv¶
# save to a list of csv files (metadata, data, year)
df_compact.name='fe23'
utf.write_compact_dataframe_to_csv(df_compact)
METADATA: datasetId, archiveType, dataSetName, geo_meanElev, geo_meanLat, geo_meanLon, geo_siteName, interpretation_direction, interpretation_seasonality, interpretation_variable, interpretation_variableDetail, originalDataURL, originalDatabase, paleoData_notes, paleoData_proxy, paleoData_sensorSpecies, paleoData_units, paleoData_variableName, yearUnits Saved to /home/jupyter-lluecke/dod2k_v2.0/dod2k/data/fe23/fe23_compact_%s.csv
# load dataframe to check that it loads correctly
df = utf.load_compact_dataframe_from_csv('fe23')
<class 'pandas.core.frame.DataFrame'> RangeIndex: 2754 entries, 0 to 2753 Data columns (total 21 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 archiveType 2754 non-null object 1 dataSetName 2754 non-null object 2 datasetId 2754 non-null object 3 geo_meanElev 2710 non-null float32 4 geo_meanLat 2754 non-null float32 5 geo_meanLon 2754 non-null float32 6 geo_siteName 2754 non-null object 7 interpretation_direction 2754 non-null object 8 interpretation_seasonality 2754 non-null object 9 interpretation_variable 2754 non-null object 10 interpretation_variableDetail 2754 non-null object 11 originalDataURL 2754 non-null object 12 originalDatabase 2754 non-null object 13 paleoData_notes 2754 non-null object 14 paleoData_proxy 2754 non-null object 15 paleoData_sensorSpecies 2754 non-null object 16 paleoData_units 2754 non-null object 17 paleoData_values 2754 non-null object 18 paleoData_variableName 2754 non-null object 19 year 2754 non-null object 20 yearUnits 2754 non-null object dtypes: float32(3), object(18) memory usage: 419.7+ KB None
Visualise dataframe¶
Show spatial distribution of records, show archive and proxy types
# count archive types
archive_count = {}
for ii, at in enumerate(set(df['archiveType'])):
archive_count[at] = df.loc[df['archiveType']==at, 'archiveType'].count()
sort = np.argsort([cc for cc in archive_count.values()])
archives_sorted = np.array([cc for cc in archive_count.keys()])[sort][::-1]
# Specify colour for each archive (smaller archives get grouped into the same colour)
archive_colour, major_archives, other_archives = uplt.get_archive_colours(archives_sorted, archive_count)
fig = uplt.plot_geo_archive_proxy(df, archive_colour)
utf.save_fig(fig, f'geo_{df.name}', dir=df.name)
0 Wood 2754 saved figure in /home/jupyter-lluecke/dod2k_v2.0/dod2k/figs/fe23/geo_fe23.pdf
Now plot the coverage over the Common Era
fig = uplt.plot_coverage(df, archives_sorted, major_archives, other_archives, archive_colour)
utf.save_fig(fig, f'time_{df.name}', dir=df.name)
saved figure in /home/jupyter-lluecke/dod2k_v2.0/dod2k/figs/fe23/time_fe23.pdf
Display dataframe¶
Display identification metadata: dataSetName, datasetId, originalDataURL, originalDatabase¶
index¶
# # check index
print(df.index)
RangeIndex(start=0, stop=2754, step=1)
dataSetName (associated with each record, may not be unique)¶
# # check dataSetName
key = 'dataSetName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
dataSetName: ['africa_keny001' 'africa_keny002' 'africa_morc001' ... 'northamerica_usa_wy034' 'northamerica_usa_wy035' 'northamerica_usa_wy036'] ["<class 'str'>"]
datasetId (unique identifier, as given by original authors, includes original database token)¶
# # check datasetId
print(len(df.datasetId.unique()))
print(len(df))
key = 'datasetId'
print('%s (starts with): '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print('datasetId starts with: ', np.unique([str(dd.split('_')[0]) for dd in df[key]]))
2754 2754 datasetId (starts with): ['FE23_africa_keny001' 'FE23_africa_keny002' 'FE23_africa_morc001' ... 'FE23_northamerica_usa_wy034' 'FE23_northamerica_usa_wy035' 'FE23_northamerica_usa_wy036'] ["<class 'str'>"] datasetId starts with: ['FE23']
originalDataURL (URL/DOI of original published record where available)¶
# originalDataURL
key = 'originalDataURL'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([kk for kk in df[key] if 'this' in kk]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# 'this study' should point to the correct URL (PAGES2k)
originalDataURL: ['https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/africa/keny001-noaa.rwl' 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/africa/keny002-noaa.rwl' 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/africa/morc001-noaa.rwl' ... 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/southamerica/chil016-noaa.rwl' 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/southamerica/chil017-noaa.rwl' 'https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/southamerica/chil018-noaa.rwl'] [] ["<class 'str'>"]
originalDatabase (original database used as input for dataframe)¶
# # originalDataSet
key = 'originalDatabase'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# Note: the last two records have missing URLs
originalDatabase: ['FE23 (Breitenmoser et al. (2014))'] ["<class 'str'>"]
geographical metadata: elevation, latitude, longitude, site name¶
geo_meanElev (mean elevation in m)¶
# check Elevation
key = 'geo_meanElev'
print('%s: '%key)
print(df[key])
print(np.unique(['%d'%kk for kk in df[key] if np.isfinite(kk)]))
print(np.unique([str(type(dd)) for dd in df[key]]))
geo_meanElev:
0 2010.0
1 2010.0
2 2200.0
3 1700.0
4 2200.0
...
2749 2500.0
2750 2542.0
2751 1319.0
2752 2400.0
2753 2378.0
Name: geo_meanElev, Length: 2754, dtype: float32
['0' '1' '10' '100' '1000' '1002' '1005' '1006' '101' '1010' '1020' '1030'
'1036' '1040' '1047' '105' '1050' '1051' '1052' '1055' '1060' '1065'
'1067' '107' '1070' '1071' '1075' '108' '1080' '1085' '109' '1090' '1095'
'1097' '110' '1100' '111' '1110' '1120' '1128' '1130' '1132' '1140'
'1146' '115' '1150' '1155' '1156' '1158' '116' '1160' '1167' '1169'
'1170' '1175' '1180' '1194' '12' '120' '1200' '1201' '1206' '1208' '1219'
'1220' '1224' '1225' '1230' '1231' '1234' '1235' '1237' '1240' '1250'
'1253' '126' '1260' '1270' '1275' '1280' '1285' '13' '130' '1300' '1302'
'131' '1310' '1311' '1315' '1317' '1319' '1320' '1325' '1330' '1340'
'135' '1350' '1354' '136' '1360' '1366' '1367' '1370' '1372' '1375'
'1377' '138' '1380' '1385' '1390' '1391' '1392' '1395' '14' '140' '1400'
'1402' '1405' '1410' '1415' '1417' '1418' '1420' '1425' '143' '1432'
'1433' '1436' '1440' '1448' '145' '1450' '1460' '1463' '1464' '1465'
'1468' '1469' '1470' '1474' '1475' '1480' '149' '1490' '1493' '1494'
'1495' '15' '150' '1500' '1510' '152' '1520' '1524' '1525' '153' '1530'
'1531' '1540' '1545' '1550' '1555' '1560' '1565' '1570' '1580' '1585'
'1586' '1595' '1596' '1598' '16' '160' '1600' '1601' '1620' '1625' '1630'
'1633' '164' '1640' '1644' '1645' '165' '1650' '1656' '1658' '1660'
'1670' '1675' '1676' '168' '1680' '1682' '1690' '1694' '17' '170' '1700'
'1701' '1706' '1707' '1710' '1720' '1722' '1723' '1725' '1731' '1735'
'1737' '1740' '175' '1750' '1755' '1760' '1767' '1768' '1770' '1772'
'1775' '1780' '1785' '1790' '1793' '1798' '18' '180' '1800' '1803' '1804'
'1811' '1817' '182' '1820' '1825' '1828' '1829' '183' '1830' '1840'
'1841' '1848' '185' '1850' '1852' '1853' '1859' '1860' '1862' '1870'
'1875' '188' '1889' '189' '1890' '19' '190' '1900' '1905' '191' '1910'
'192' '1920' '1921' '1922' '1925' '1938' '194' '1940' '1942' '1945' '195'
'1950' '1951' '1958' '1960' '1965' '1966' '1969' '197' '1970' '1975'
'1980' '1981' '199' '1996' '2' '20' '200' '2000' '2002' '2004' '201'
'2010' '2011' '2012' '2013' '2020' '2024' '2027' '2030' '2042' '205'
'2050' '2057' '2060' '2065' '207' '2070' '2072' '2073' '2075' '208'
'2080' '2084' '2085' '209' '2090' '2097' '2098' '210' '2100' '2103' '211'
'2115' '2118' '2121' '213' '2130' '2133' '2134' '214' '2140' '2142' '215'
'2150' '2160' '2164' '2165' '217' '2170' '2179' '218' '2180' '2185'
'2187' '2194' '2195' '2196' '220' '2200' '2210' '2215' '2225' '2229'
'223' '2242' '225' '2250' '2255' '2256' '2265' '2268' '2270' '2271'
'2272' '228' '2280' '2284' '2286' '2289' '229' '2290' '230' '2300' '2301'
'2310' '2316' '232' '2320' '2323' '233' '2332' '2333' '2346' '2347' '235'
'2350' '2362' '2370' '2375' '2377' '2378' '2380' '2385' '239' '2392'
'2393' '2394' '24' '240' '2400' '2407' '2408' '2417' '2420' '2423' '243'
'2438' '244' '2441' '245' '246' '2460' '2465' '2469' '2475' '2484' '2498'
'2499' '25' '250' '2500' '251' '2514' '2515' '2530' '2535' '2542' '2550'
'2560' '257' '258' '2580' '259' '2590' '2591' '2592' '26' '260' '2600'
'2605' '2615' '262' '2621' '2626' '2630' '2636' '2637' '2641' '2645'
'2650' '2651' '2652' '2658' '267' '2670' '2682' '2688' '2690' '2696'
'2697' '27' '270' '2700' '2713' '2727' '2730' '2731' '274' '2740' '2741'
'2743' '2745' '2746' '275' '2750' '2755' '2760' '2774' '2790' '280'
'2800' '2804' '2805' '2816' '282' '2820' '2828' '2835' '285' '2850'
'2865' '2877' '2880' '2890' '2894' '2895' '2896' '290' '2900' '291'
'2925' '2926' '2930' '2940' '295' '2950' '2956' '2960' '297' '2970'
'2987' '2990' '3' '30' '300' '3000' '3017' '3020' '3025' '3033' '3048'
'305' '3050' '3065' '307' '308' '3095' '310' '3100' '3110' '3113' '3115'
'3120' '3125' '314' '3140' '315' '3150' '3154' '317' '3170' '3190' '320'
'3200' '3208' '321' '3218' '3220' '3221' '3230' '3235' '325' '3250'
'3261' '3276' '329' '3290' '3291' '330' '3300' '3320' '3330' '335' '3352'
'3353' '3370' '3378' '339' '340' '3400' '3413' '3415' '342' '3420' '3425'
'345' '3450' '3470' '3475' '3480' '35' '350' '3500' '3505' '3519' '3535'
'3536' '354' '355' '3570' '360' '3600' '362' '3630' '366' '3660' '3688'
'370' '3700' '3719' '3720' '3740' '375' '376' '378' '38' '380' '3800'
'381' '384' '385' '387' '390' '392' '395' '396' '40' '400' '401' '402'
'405' '408' '410' '411' '413' '420' '421' '424' '425' '426' '427' '43'
'430' '438' '44' '440' '442' '443' '445' '45' '450' '455' '457' '459'
'46' '460' '465' '468' '469' '47' '470' '475' '480' '482' '490' '493'
'494' '5' '50' '500' '501' '503' '510' '512' '518' '520' '523' '525' '53'
'530' '535' '540' '55' '550' '555' '558' '56' '560' '564' '570' '575'
'576' '579' '580' '582' '590' '597' '6' '60' '600' '607' '61' '610' '611'
'612' '620' '625' '63' '630' '631' '64' '640' '645' '646' '65' '650'
'658' '660' '67' '670' '672' '675' '68' '680' '690' '7' '70' '700' '701'
'705' '710' '715' '716' '720' '725' '730' '731' '738' '74' '740' '745'
'747' '75' '750' '755' '76' '762' '765' '77' '770' '775' '78' '780' '785'
'790' '792' '798' '8' '80' '800' '803' '805' '808' '810' '820' '822'
'823' '825' '830' '838' '840' '85' '850' '853' '854' '860' '867' '87'
'870' '872' '875' '880' '884' '89' '890' '9' '90' '900' '910' '914' '915'
'918' '920' '923' '925' '929' '930' '940' '945' '95' '950' '952' '960'
'967' '970' '975' '976' '980' '988' '99' '990' '991' '994' '995']
["<class 'float'>"]
geo_meanLat (mean latitude in degrees N)¶
# # Latitude
key = 'geo_meanLat'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
geo_meanLat: ['-18' '-22' '-23' '-24' '-25' '-26' '-27' '-31' '-32' '-33' '-34' '-35' '-36' '-37' '-38' '-39' '-40' '-41' '-42' '-43' '-44' '-45' '-46' '-50' '-53' '-54' '-7' '0' '16' '17' '19' '20' '21' '23' '24' '25' '26' '27' '28' '29' '30' '31' '32' '33' '34' '35' '36' '37' '38' '39' '40' '41' '42' '43' '44' '45' '46' '47' '48' '49' '50' '51' '52' '53' '54' '55' '56' '57' '58' '59' '60' '61' '62' '63' '64' '65' '66' '67' '68' '69' '70' '71' '72'] ["<class 'float'>"]
geo_meanLon (mean longitude)¶
# # Longitude
key = 'geo_meanLon'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
geo_meanLon: ['-1' '-100' '-101' '-102' '-103' '-104' '-105' '-106' '-107' '-108' '-109' '-110' '-111' '-112' '-113' '-114' '-115' '-116' '-117' '-118' '-119' '-120' '-121' '-122' '-123' '-124' '-125' '-126' '-127' '-128' '-129' '-130' '-133' '-134' '-135' '-136' '-137' '-138' '-139' '-140' '-141' '-142' '-143' '-144' '-145' '-146' '-147' '-148' '-149' '-150' '-151' '-152' '-153' '-154' '-159' '-162' '-163' '-2' '-3' '-4' '-5' '-58' '-6' '-61' '-62' '-63' '-64' '-65' '-66' '-67' '-68' '-69' '-7' '-70' '-71' '-72' '-73' '-74' '-75' '-76' '-77' '-78' '-79' '-8' '-80' '-81' '-82' '-83' '-84' '-85' '-86' '-87' '-88' '-89' '-9' '-90' '-91' '-92' '-93' '-94' '-95' '-96' '-97' '-98' '-99' '0' '1' '10' '100' '101' '103' '104' '105' '106' '107' '109' '11' '110' '111' '112' '114' '115' '117' '118' '119' '12' '122' '125' '127' '128' '129' '13' '130' '132' '133' '136' '137' '138' '14' '141' '142' '143' '145' '146' '147' '148' '149' '15' '150' '151' '153' '154' '155' '158' '159' '16' '160' '163' '165' '167' '168' '169' '17' '170' '171' '172' '173' '174' '175' '176' '177' '18' '19' '2' '20' '21' '22' '23' '24' '25' '26' '27' '28' '29' '30' '31' '32' '33' '34' '35' '36' '37' '4' '41' '42' '43' '44' '45' '5' '50' '51' '53' '56' '57' '58' '59' '6' '60' '64' '65' '69' '7' '71' '72' '74' '75' '76' '77' '78' '79' '8' '80' '81' '82' '83' '84' '85' '86' '87' '88' '89' '9' '90' '91' '93' '94' '95' '97' '98' '99'] ["<class 'float'>"]
geo_siteName (name of collection site)¶
# Site Name
key = 'geo_siteName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
geo_siteName: ['RagatiForestStationNyeriDistrict' 'RagatiForestStationNyeriDistrict' 'Tounfite' ... 'DevilsTowerNationalMonument' 'CookingHillside' 'KretecVale'] ["<class 'str'>"]
proxy metadata: archive type, proxy type, interpretation¶
archiveType (archive type)¶
# archiveType
key = 'archiveType'
print('%s: '%key)
print(np.unique(df[key]))
print(np.unique([str(type(dd)) for dd in df[key]]))
archiveType: ['Wood'] ["<class 'str'>"]
paleoData_proxy (proxy type)¶
# paleoData_proxy
key = 'paleoData_proxy'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_proxy: ['ring width'] ["<class 'str'>"]
paleoData_sensorSpecies (further information on proxy type: species)¶
# climate_interpretation
key = 'paleoData_sensorSpecies'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_sensorSpecies: ['ABAL' 'ABAM' 'ABBA' 'ABBO' 'ABCE' 'ABCI' 'ABCO' 'ABLA' 'ABMA' 'ABPI' 'ABPN' 'ABPR' 'ABSB' 'ABSP' 'ACRU' 'ACSH' 'ADHO' 'ADUS' 'AGAU' 'ARAR' 'ATCU' 'ATSE' 'AUCH' 'BEPU' 'CABU' 'CADE' 'CADN' 'CARO' 'CDAT' 'CDBR' 'CDDE' 'CDLI' 'CEAN' 'CESP' 'CHLA' 'CHNO' 'DABI' 'DACO' 'FAGR' 'FASY' 'FICU' 'FRNI' 'HABI' 'JGAU' 'JUEX' 'JUFO' 'JUOC' 'JUPH' 'JUPR' 'JURE' 'JUSC' 'JUSP' 'JUVI' 'LADE' 'LAGM' 'LALA' 'LALY' 'LAOC' 'LASI' 'LGFR' 'LIBI' 'LITU' 'NOBE' 'NOGU' 'NOME' 'NOPU' 'NOSO' 'PCAB' 'PCEN' 'PCGL' 'PCGN' 'PCMA' 'PCOB' 'PCOM' 'PCPU' 'PCRU' 'PCSH' 'PCSI' 'PCSM' 'PCSP' 'PHAL' 'PHAS' 'PHGL' 'PHTR' 'PIAL' 'PIAM' 'PIAR' 'PIBA' 'PIBN' 'PIBR' 'PICE' 'PICL' 'PICO' 'PIEC' 'PIED' 'PIFL' 'PIHA' 'PIHR' 'PIJE' 'PIKO' 'PILA' 'PILE' 'PILO' 'PIMO' 'PIMU' 'PIMZ' 'PINI' 'PIPA' 'PIPE' 'PIPI' 'PIPN' 'PIPO' 'PIPU' 'PIRE' 'PIRI' 'PIRO' 'PISF' 'PISI' 'PISP' 'PIST' 'PISY' 'PITA' 'PITO' 'PIUN' 'PIVI' 'PIWA' 'PLRA' 'PLUV' 'PPDE' 'PPSP' 'PRMA' 'PSMA' 'PSME' 'PTAN' 'QUAL' 'QUDG' 'QUFR' 'QUHA' 'QUKE' 'QULO' 'QULY' 'QUMA' 'QUMC' 'QUPE' 'QUPR' 'QURO' 'QURU' 'QUSP' 'QUST' 'QUVE' 'TABA' 'TADI' 'TAMU' 'TEGR' 'THOC' 'THPL' 'TSCA' 'TSCR' 'TSDU' 'TSHE' 'TSME' 'ULSP' 'VIKE' 'WICE'] ["<class 'str'>"]
paleoData_notes (notes)¶
# # paleoData_notes
key = 'paleoData_notes'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_notes: ['Investigator: Stahle' 'Investigator: Stahle' 'Investigator: Stockton' ... 'Investigator: Stambaugh' 'Investigator: King' 'Investigator: King'] ["<class 'str'>"]
paleoData_variableName¶
# paleoData_variableName
key = 'paleoData_variableName'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_variableName: ['ring width'] ["<class 'str'>"]
climate metadata: interpretation variable, direction, seasonality¶
interpretation_direction¶
# climate_interpretation
key = 'interpretation_direction'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_direction: ['N/A'] No. of unique values: 1/2754
interpretation_seasonality¶
# climate_interpretation
key = 'interpretation_seasonality'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_seasonality: ['N/A'] No. of unique values: 1/2754
interpretation_variable¶
# climate_interpretation
key = 'interpretation_variable'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variable: ['N/A' 'NOT temperature NOT moisture' 'moisture' 'temperature' 'temperature+moisture'] No. of unique values: 5/2754
interpretation_variableDetail¶
# climate_interpretation
key = 'interpretation_variableDetail'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variableDetail: ['N/A'] No. of unique values: 1/2754
data¶
paleoData_values¶
# # paleoData_values
key = 'paleoData_values'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
try:
print('%-30s: %s -- %s'%(df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
print(type(vv))
except: print(df['dataSetName'].iloc[ii], 'NaNs detected.')
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_values: africa_keny001 : 0.4 -- 1.423 <class 'numpy.ndarray'> africa_keny002 : 0.499 -- 1.631 <class 'numpy.ndarray'> africa_morc001 : -0.014 -- 2.226 <class 'numpy.ndarray'> africa_morc002 : 0.323 -- 1.587 <class 'numpy.ndarray'> africa_morc003 : 0.004 -- 1.617 <class 'numpy.ndarray'> africa_morc011 : 0.005 -- 2.094 <class 'numpy.ndarray'> africa_morc012 : 0.435 -- 1.866 <class 'numpy.ndarray'> africa_morc013 : 0.166 -- 1.389 <class 'numpy.ndarray'> africa_morc014 : -0.025 -- 2.012 <class 'numpy.ndarray'> africa_safr001 : 0.485 -- 2.129 <class 'numpy.ndarray'> africa_zimb001 : 0.15 -- 2.415 <class 'numpy.ndarray'> africa_zimb002 : 0.178 -- 2.044 <class 'numpy.ndarray'> africa_zimb003 : 0.24 -- 2.701 <class 'numpy.ndarray'> southamerica_arge : 0.161 -- 1.867 <class 'numpy.ndarray'> southamerica_arge001 : 0.336 -- 2.362 <class 'numpy.ndarray'> southamerica_arge002 : 0.478 -- 1.815 <class 'numpy.ndarray'> southamerica_arge004 : 0.508 -- 1.714 <class 'numpy.ndarray'> southamerica_arge005 : 0.313 -- 1.563 <class 'numpy.ndarray'> southamerica_arge006 : 0.203 -- 1.791 <class 'numpy.ndarray'> southamerica_arge007 : 0.368 -- 1.652 <class 'numpy.ndarray'> ["<class 'numpy.ndarray'>"]
paleoData_units¶
# paleoData_units
key = 'paleoData_units'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_units: ['standardized_anomalies'] ["<class 'str'>"]
year¶
# # year
key = 'year'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
try: print('%-30s: %s -- %s'%(df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
except: print('NaNs detected.', vv)
print(np.unique([str(type(dd)) for dd in df[key]]))
year: africa_keny001 : 1944.0 -- 1993.0 africa_keny002 : 1950.0 -- 1994.0 africa_morc001 : 1360.0 -- 1983.0 africa_morc002 : 1686.0 -- 1984.0 africa_morc003 : 1755.0 -- 1984.0 africa_morc011 : 1598.0 -- 1984.0 africa_morc012 : 1813.0 -- 1984.0 africa_morc013 : 1854.0 -- 1984.0 africa_morc014 : 1200.0 -- 1984.0 africa_safr001 : 1665.0 -- 1976.0 africa_zimb001 : 1925.0 -- 1994.0 africa_zimb002 : 1877.0 -- 1997.0 africa_zimb003 : 1880.0 -- 1996.0 southamerica_arge : 1900.0 -- 1974.0 southamerica_arge001 : 1605.0 -- 1974.0 southamerica_arge002 : 1800.0 -- 1974.0 southamerica_arge004 : 1532.0 -- 1974.0 southamerica_arge005 : 1641.0 -- 1974.0 southamerica_arge006 : 1449.0 -- 1974.0 southamerica_arge007 : 1579.0 -- 1974.0 ["<class 'numpy.ndarray'>"]
yearUnits¶
# yearUnits
key = 'yearUnits'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
yearUnits: ['CE'] ["<class 'str'>"]