Duplicate detection - step 2: review and decide on candidate pairs¶
This notebook runs the second part of the duplicate detection algorithm on a dataframe with the following columns:
- archiveType (used for duplicate detection algorithm)
- dataSetName
- datasetId
- geo_meanElev (used for duplicate detection algorithm)
- geo_meanLat (used for duplicate detection algorithm)
- geo_meanLon (used for duplicate detection algorithm)
- geo_siteName (used for duplicate detection algorithm)
- interpretation_direction
- interpretation_seasonality
- interpretation_variable
- interpretation_variableDetails
- originalDataURL
- originalDatabase
- paleoData_notes
- paleoData_proxy (used for duplicate detection algorithm)
- paleoData_units
- paleoData_values (used for duplicate detection algorithm, test for correlation, RMSE, correlation of 1st difference, RMSE of 1st difference)
- paleoData_variableName
- year (used for duplicate detection algorithm)
- yearUnits
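The four similarity measures listed for paleoData_values (correlation, RMSE, and both again on the first difference) can be sketched as below. This is a minimal illustration in plain Python, assuming both records have already been restricted to their common years; the function names are hypothetical and are not taken from dod2k_utilities, whose implementation may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def first_difference(x):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(x, x[1:])]

def similarity_metrics(x, y):
    """Collect the four measures for two records on a common time axis."""
    dx, dy = first_difference(x), first_difference(y)
    return {
        "corr": pearson(x, y),
        "rmse": rmse(x, y),
        "corr_diff": pearson(dx, dy),
        "rmse_diff": rmse(dx, dy),
    }

# Two made-up records that differ only by a constant offset of 1.
metrics = similarity_metrics([1.0, 2.0, 4.0, 7.0], [2.0, 3.0, 5.0, 8.0])
```

A constant offset leaves the correlation at 1 and the first-difference RMSE at 0, while the plain RMSE equals the offset. This is why the first-difference measures are useful for spotting the same record stored with a different baseline.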
This interactive notebook runs a duplicate decision algorithm for a specific database, following the identification of the potential duplicate candidate pairs. The algorithm walks the operator through each of the detected duplicate candidate pairs from dup_detection.ipynb and runs a decision process to decide whether to keep or reject the identified records.
The confirmed 'true' duplicates are saved in
data/DATABASENAME/duplicate_detection/duplicate_decisions_DATABASENAME_AUTHORINITIALS_YY-MM-DD.csv
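To make the naming pattern above concrete, here is a small sketch that assembles such a path. The helper name decision_file_path is hypothetical; in the notebook the file name is produced inside the utility functions, so this only illustrates the pattern and is not the code that writes the file.

```python
from datetime import date
from pathlib import PurePosixPath

def decision_file_path(db_name, initials, day):
    """Build a path following the pattern stated above (illustrative only)."""
    stamp = day.strftime("%y-%m-%d")  # YY-MM-DD
    fname = f"duplicate_decisions_{db_name}_{initials}_{stamp}.csv"
    return PurePosixPath("data") / db_name / "duplicate_detection" / fname

example_path = decision_file_path("all_merged", "LL", date(2025, 11, 10))
print(example_path)
```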
- 10/11/2025 LL: tidied up with revised data organisation and prepared for documentation
- 27/11/2024 LL: Changed hierarchy FE23>PAGES 2k
- 22/10/2024 v1: Updated the decision process:
  - created backup decision file which is intermediately saved
  - outputs URL which can be copied and pasted into browser
  - implemented a composite option in the decision process, to create a composite of two records
Author: Lucie Luecke, created 27/9/2024
Note: The algorithm can be started either from scratch or from a backup file.
Initialisation¶
Set up working environment¶
Make sure the repo_root is set correctly; it should be your_root_dir/dod2k. This should be the working directory throughout this notebook (and all other notebooks).
%load_ext autoreload
%autoreload 2
import sys
import os
from pathlib import Path
# Add parent directory to path (works from any notebook in notebooks/)
# the repo_root should be the parent directory of the notebooks folder
current_dir = Path().resolve()
# Determine repo root
if current_dir.name == 'dod2k': repo_root = current_dir
elif current_dir.parent.name == 'dod2k': repo_root = current_dir.parent
else: raise Exception('Please review the repo root structure (see first cell).')
# Update cwd and path only if needed
if os.getcwd() != str(repo_root):
    os.chdir(repo_root)
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))
print(f"Repo root: {repo_root}")
if str(os.getcwd())==str(repo_root):
    print(f"Working directory matches repo root. ")
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
Repo root: /home/jupyter-lluecke/dod2k
Working directory matches repo root.
import pandas as pd
import numpy as np
import datetime
from dod2k_utilities import ut_functions as utf # contains utility functions
from dod2k_utilities import ut_duplicate_search as dup # contains utility functions
Load dataset¶
Define the dataset which needs to be screened for duplicates. Input files for the duplicate detection mechanism need to be compact dataframes (pandas dataframes with standardised columns and entry formatting).
The function load_compact_dataframe_from_csv loads the dataframe from a csv file in data/DB/, where DB is the name of the database. The database name (db_name) can be one of
- pages2k
- ch2k
- iso2k
- sisal
- fe23
for the individual databases, all_merged for the merged database of all individual databases, or the name of any user-defined compact dataframe.
# load dataframe
db_name='all_merged'
# db_name='dup_test'
df = utf.load_compact_dataframe_from_csv(db_name)
print(df.info())
df.name = db_name
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5147 entries, 0 to 5146
Data columns (total 21 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   archiveType                    5147 non-null   object
 1   dataSetName                    5147 non-null   object
 2   datasetId                      5147 non-null   object
 3   geo_meanElev                   5048 non-null   float32
 4   geo_meanLat                    5147 non-null   float32
 5   geo_meanLon                    5147 non-null   float32
 6   geo_siteName                   5147 non-null   object
 7   interpretation_direction       5147 non-null   object
 8   interpretation_seasonality     5147 non-null   object
 9   interpretation_variable        5147 non-null   object
 10  interpretation_variableDetail  5147 non-null   object
 11  originalDataURL                5147 non-null   object
 12  originalDatabase               5147 non-null   object
 13  paleoData_notes                5147 non-null   object
 14  paleoData_proxy                5147 non-null   object
 15  paleoData_sensorSpecies        5147 non-null   object
 16  paleoData_units                5147 non-null   object
 17  paleoData_values               5147 non-null   object
 18  paleoData_variableName         5147 non-null   object
 19  year                           5147 non-null   object
 20  yearUnits                      5147 non-null   object
dtypes: float32(3), object(18)
memory usage: 784.2+ KB
None
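Before starting the decision process it can help to confirm that the columns the duplicate detection algorithm relies on are present. The sketch below is an illustration only: missing_columns is a hypothetical helper, and the list of required columns is taken from the columns marked "used for duplicate detection algorithm" at the top of this notebook.

```python
# Columns marked as used by the duplicate detection algorithm.
REQUIRED_COLUMNS = [
    "archiveType", "geo_meanElev", "geo_meanLat", "geo_meanLon",
    "geo_siteName", "paleoData_proxy", "paleoData_values", "year",
]

def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [c for c in required if c not in present]

# In the notebook this would be called as missing_columns(df.columns).
example_columns = ["archiveType", "geo_meanLat", "geo_meanLon", "year"]
absent = missing_columns(example_columns)
print(absent)
```

An empty list means the dataframe carries every column the algorithm needs.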
Input operator's credentials¶
To maximise transparency and reproducibility, enter the operator's credentials here.
These details are used to flag the intermediate output files and are provided along with the final duplicate-free dataset.
initials = 'LL'
fullname = 'Lucie Luecke'
email = 'ljluec1@st-andrews.ac.uk'
# initials = 'MNE'
# fullname = 'Michael Evans'
# email = 'mnevans@umd.edu'
operator_details = [initials, fullname, email]
Duplicate decision process¶
We now start the duplicate decision process.
Hierarchy for duplicate removal for identical duplicates¶
For automated decisions, which apply to identical duplicates, we now define a hierarchy of databases, which decides which record should be kept.
First, list all the original databases:
for db in df.originalDatabase.unique():
    print(db)
PAGES 2k v2.2.0
FE23 (Breitenmoser et al. (2014))
CoralHydro2k v1.0.1
Iso2k v1.1.2
SISAL v3
Now assign a hierarchy (importance level) to the original databases. For $n$ original databases the hierarchy ranges from 1, the highest rank (most important, should always be kept), to $n$, the lowest rank (least important).
The operator can choose their own hierarchy by passing a dictionary as a kwarg to the function dup.define_hierarchy, in which each database is assigned an integer according to its importance (1 being the most important).
By default the hierarchy uses the novelty of the databases for determining the importance level:
PAGES 2k v2.2.0 > SISAL v3 > CoralHydro2k v1.0.1 > Iso2k v1.1.2 > FE23 (Breitenmoser et al. (2014))
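A custom hierarchy could be written as a dictionary like the one below, which simply reproduces the default order. The exact dictionary format expected by dup.define_hierarchy is an assumption here (database names as keys, integer ranks starting at 1 as values), and validate_hierarchy is a hypothetical helper for checking the dictionary before passing it on.

```python
def validate_hierarchy(hierarchy, databases):
    """Return True if every database has a unique integer rank in 1..n."""
    if set(hierarchy) != set(databases):
        return False
    return sorted(hierarchy.values()) == list(range(1, len(databases) + 1))

# Database names as printed by df.originalDatabase.unique() above.
databases = [
    "PAGES 2k v2.2.0",
    "FE23 (Breitenmoser et al. (2014))",
    "CoralHydro2k v1.0.1",
    "Iso2k v1.1.2",
    "SISAL v3",
]

# Assumed format: database name -> rank (1 = most important).
custom_hierarchy = {
    "PAGES 2k v2.2.0": 1,
    "SISAL v3": 2,
    "CoralHydro2k v1.0.1": 3,
    "Iso2k v1.1.2": 4,
    "FE23 (Breitenmoser et al. (2014))": 5,
}

hierarchy_ok = validate_hierarchy(custom_hierarchy, databases)
```

Checking that the ranks form a gap-free sequence and that no database is missing avoids ambiguous automated decisions later on.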
# implement hierarchy for automated decisions for identical records
df = dup.define_hierarchy(df, hierarchy='default')
Chosen hierarchy:
1. PAGES 2k v2.2.0 (highest)
2. SISAL v3
3. CoralHydro2k v1.0.1
4. Iso2k v1.1.2
5. FE23 (Breitenmoser et al. (2014)) (lowest)
# make sure all datasetIds are unique.
assert len(set(df.datasetId))==len(df.datasetId)
assert (df.index==range(0, len(df), 1)).all()
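If the uniqueness assertion fails, it does not say which IDs are repeated. A small sketch for listing them is given below; duplicated_ids is a hypothetical helper and the example IDs are for illustration only.

```python
from collections import Counter

def duplicated_ids(ids):
    """Return the sorted list of IDs that occur more than once."""
    counts = Counter(ids)
    return sorted(i for i, n in counts.items() if n > 1)

# In the notebook this would be called as duplicated_ids(df.datasetId).
repeated = duplicated_ids(["pages2k_0", "iso2k_296", "pages2k_0"])
print(repeated)
```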
Duplicate decision process¶
The following cell takes you through the potential duplicate candidate pairs and lets you decide whether to
- keep both records
- keep just one record
- delete both records
- create composite of both records.
Recollections and updates of duplicates are automatically selected, as well as identical duplicates following the hierarchy defined above.
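The log printed by the decision cell shows two checks for each pair, a correlation threshold of 0.98 and equal record lengths, followed by a correlation_perfect flag. In the pairs shown in this notebook the flag is True exactly when both checks are True. The sketch below encodes that reading; it is an inference from the printed log, not the implementation in ut_duplicate_search, which may differ.

```python
CORR_THRESHOLD = 0.98  # threshold value printed in the log output

def correlation_perfect(corr, time_1, time_2, threshold=CORR_THRESHOLD):
    """Assumed rule: high correlation AND equally long time axes."""
    return corr >= threshold and len(time_1) == len(time_2)

# Values modelled on pairs from the log; the time axes are placeholders.
same_length = correlation_perfect(0.9815589923346684, range(100), range(100))
different_length = correlation_perfect(0.996675589148305, range(100), range(90))
low_correlation = correlation_perfect(0.9442037258051723, range(100), range(100))
```

Under this reading, a pair with a correlation of 0.997 but records of different length is still passed on for a manual or database-based decision.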
The output is saved in data/DATABASENAME/dup_detection/dup_decisions_DATABASENAME_INITIALS_DATE.csv
Summary figures are saved in figs/DATABASENAME/dup_detection/, also linked in the output csv file.
Note: The operator has the option to restart the decision process from a backup file in the directory data/DATABASENAME/dup_detection. This can be especially useful should the connection be interrupted during the process.
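The idea behind restarting from a backup can be sketched as follows: every decision is appended to a file as soon as it is made, and on restart the pairs already recorded are skipped. This is a minimal illustration under stated assumptions: the column names (id_1, id_2, decision), the decision label and both function names are hypothetical and do not reflect the actual backup format written by dup.duplicate_decisions_multiple.

```python
import csv
import os
import tempfile

FIELDS = ["id_1", "id_2", "decision"]  # hypothetical backup columns

def load_backup(path):
    """Return {(id_1, id_2): decision} from a backup file, or {} if none exists."""
    if not os.path.exists(path):
        return {}
    with open(path, newline="") as f:
        return {(r["id_1"], r["id_2"]): r["decision"] for r in csv.DictReader(f)}

def run_decisions(pairs, decide, path):
    """Decide each pair once, writing every decision to the backup immediately."""
    done = load_backup(path)
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        for id_1, id_2 in pairs:
            if (id_1, id_2) in done:
                continue  # already decided in an earlier session
            done[(id_1, id_2)] = decide(id_1, id_2)
            writer.writerow({"id_1": id_1, "id_2": id_2,
                             "decision": done[(id_1, id_2)]})
            f.flush()  # keep the backup current if the session drops
    return done

pairs = [("pages2k_62", "pages2k_63"), ("pages2k_85", "pages2k_88")]
path = os.path.join(tempfile.mkdtemp(), "backup.csv")

# First session: interrupted after the first pair.
run_decisions(pairs[:1], lambda a, b: "keep_1", path)

# Second session: only the remaining pair is presented again.
calls = []
def decide(a, b):
    calls.append((a, b))
    return "keep_1"

result = run_decisions(pairs, decide, path)
```

Because the backup is a plain csv file in this sketch, an earlier decision can be revised by editing the corresponding row before restarting.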
You now have the option to implement an automatic choice for specific database combinations. Please also specify a reason!
This is for records which do not satisfy the hierarchy criterion, i.e. records with different data but identical metadata, such as updated records.
If you do not wish to do this, delete automate_db_choice from kwargs or set to False (default).
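How such an automatic choice might be applied to a single candidate pair is sketched below. The function apply_db_choice is hypothetical; it assumes the rule only applies when the metadata of both records are identical and the two records come from exactly the preferred and the rejected database. The dictionary keys match those used in this notebook.

```python
def apply_db_choice(db_1, db_2, metadata_identical, choice):
    """Return 1 or 2 (record to keep), or None if the rule does not apply."""
    if not choice or not metadata_identical:
        return None
    if db_1 == choice["preferred_db"] and db_2 == choice["rejected_db"]:
        return 1
    if db_2 == choice["preferred_db"] and db_1 == choice["rejected_db"]:
        return 2
    return None

example_choice = {
    "preferred_db": "FE23 (Breitenmoser et al. (2014))",
    "rejected_db": "PAGES 2k v2.2.0",
    "reason": "conservative replication requirement",
}

# A PAGES 2k record paired with an FE23 record with identical metadata.
keep = apply_db_choice("PAGES 2k v2.2.0",
                       "FE23 (Breitenmoser et al. (2014))",
                       True, example_choice)
```

Returning None for all other combinations leaves those pairs to the hierarchy rule or to the operator.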
automate_db_choice = {'preferred_db': 'FE23 (Breitenmoser et al. (2014))',
                      'rejected_db': 'PAGES 2k v2.2.0',
                      'reason': 'conservative replication requirement'}
# remove_identicals = True if you want to automatically remove identical duplicates, without operator input
dup.duplicate_decisions_multiple(df, operator_details=operator_details, choose_recollection=True,
                                 remove_identicals=True, backup=True, comment=True,
                                 automate_db_choice=automate_db_choice)
No back up.
------------------------------------------------------------
Detected MULTIPLE duplicates, including:
pages2k_0 3 ['iso2k_296', 'iso2k_298', 'iso2k_299']
pages2k_81 2 ['ch2k_HE08LRA01_76', 'iso2k_1813']
pages2k_225 2 ['FE23_northamerica_usa_nv512', 'FE23_northamerica_usa_nv521']
pages2k_242 2 ['ch2k_LI06FIJ01_582', 'iso2k_353']
pages2k_267 2 ['iso2k_58', 'iso2k_1068']
pages2k_271 2 ['ch2k_FE18RUS01_492', 'iso2k_1861']
pages2k_317 2 ['ch2k_NA09MAL01_84', 'iso2k_1754']
pages2k_385 2 ['ch2k_FE09OGA01_304', 'iso2k_1922']
pages2k_395 2 ['ch2k_CA07FLI01_400', 'iso2k_1057']
pages2k_409 2 ['ch2k_QU96ESV01_422', 'iso2k_218']
pages2k_444 2 ['pages2k_445', 'pages2k_446']
pages2k_462 2 ['ch2k_OS14UCP01_236', 'iso2k_350']
pages2k_468 2 ['pages2k_3550', 'FE23_asia_russ137w']
pages2k_472 2 ['pages2k_474', 'pages2k_477']
pages2k_495 2 ['ch2k_LI06RAR01_12', 'iso2k_1502']
pages2k_500 2 ['ch2k_AS05GUA01_302', 'iso2k_1559']
pages2k_592 2 ['ch2k_LI06RAR02_270', 'iso2k_1500']
pages2k_831 2 ['pages2k_2220', 'FE23_asia_russ127w']
pages2k_893 2 ['pages2k_895', 'pages2k_900']
pages2k_940 3 ['ch2k_DR99ABR01_264', 'ch2k_DR99ABR01_266', 'iso2k_91']
pages2k_1089 2 ['FE23_northamerica_usa_mt112', 'FE23_northamerica_usa_mt113']
pages2k_1147 3 ['ch2k_DA06MAF01_78', 'ch2k_DA06MAF02_104', 'iso2k_1748']
pages2k_1153 2 ['pages2k_1156', 'pages2k_1160']
pages2k_1360 3 ['ch2k_UR00MAI01_22', 'iso2k_94', 'iso2k_98']
pages2k_1488 4 ['pages2k_1628', 'ch2k_NU11PAL01_52', 'iso2k_505', 'iso2k_579']
pages2k_1703 2 ['ch2k_MO06PED01_226', 'iso2k_629']
pages2k_1750 2 ['iso2k_1856', 'sisal_294.0_194']
pages2k_1859 2 ['ch2k_HE10GUA01_244', 'iso2k_1735']
pages2k_1942 2 ['ch2k_ZI04IFR01_26', 'iso2k_257']
pages2k_2042 2 ['ch2k_TU95MAD01_24', 'iso2k_20']
pages2k_2094 2 ['ch2k_TU01DEP01_450', 'iso2k_1201']
pages2k_2146 2 ['pages2k_2149', 'pages2k_2150']
pages2k_2604 2 ['pages2k_2606', 'iso2k_1481']
pages2k_2607 2 ['pages2k_2609', 'pages2k_2612']
pages2k_2752 2 ['pages2k_2755', 'pages2k_2759']
pages2k_2793 2 ['pages2k_2795', 'pages2k_2798']
pages2k_3028 2 ['pages2k_3030', 'pages2k_3033']
pages2k_3068 2 ['ch2k_ZI14IFR02_522', 'ch2k_ZI14IFR02_524']
pages2k_3085 3 ['ch2k_KU00NIN01_150', 'iso2k_1554', 'iso2k_1556']
pages2k_3132 2 ['ch2k_QU06RAB01_144', 'iso2k_1311']
pages2k_3234 2 ['pages2k_3236', 'pages2k_3239']
pages2k_3266 2 ['ch2k_GO12SBV01_396', 'iso2k_870']
pages2k_3352 3 ['ch2k_ZI14TUR01_480', 'ch2k_ZI14TUR01_482', 'iso2k_302']
pages2k_3372 2 ['ch2k_KI04MCV01_366', 'iso2k_155']
pages2k_3554 2 ['ch2k_LI94SEC01_436', 'iso2k_1124']
pages2k_3599 2 ['iso2k_1069', 'iso2k_1660']
ch2k_KU99HOU01_40 2 ['iso2k_786', 'iso2k_788']
ch2k_XI17HAI01_128 2 ['ch2k_XI17HAI01_136', 'iso2k_1762']
ch2k_HE13MIS01_194 2 ['iso2k_211', 'iso2k_213']
ch2k_PF04PBA01_204 2 ['iso2k_1701', 'iso2k_1704']
ch2k_GU99NAU01_314 2 ['iso2k_702', 'iso2k_705']
ch2k_DE13HAI01_424 2 ['ch2k_DE13HAI01_432', 'iso2k_1643']
iso2k_399 2 ['iso2k_806', 'iso2k_811']
iso2k_1107 2 ['iso2k_1817', 'sisal_271.0_174']
PLEASE PAY ATTENTION WHEN MAKING DECISIONS FOR THESE DUPLICATES!
The decision process will go through the duplicates on a PAIR-BY-PAIR basis, which is not optimised for multiple duplicates.
The multiples will be highlighted throughout the decision process.
Should the operator want to go back and revise a previous decision based on the presentation of a new candidate pair, they can manually modify the backup file to alter any previous decisions.
------------------------------------------------------------ > 1/429,pages2k_0,iso2k_296,0.0,0.9999999995947852 ==================================================================== === POTENTIAL DUPLICATE 0/429: pages2k_0+iso2k_296 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-WDC05A.Steig.2013.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/22531 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_0, keep iso2k_296. write decision to backup file > 2/429,pages2k_0,iso2k_298,0.0,0.9999999995947852 ==================================================================== === POTENTIAL DUPLICATE 1/429: pages2k_0+iso2k_298 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-WDC05A.Steig.2013.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/22531 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_0, keep iso2k_298. 
write decision to backup file > 3/429,pages2k_0,iso2k_299,0.0,0.9999999995947852 ==================================================================== === POTENTIAL DUPLICATE 2/429: pages2k_0+iso2k_299 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-WDC05A.Steig.2013.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/22531 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_0, keep iso2k_299. write decision to backup file > 4/429,pages2k_6,FE23_northamerica_usa_az555,5.775408685862238,0.978353859816631 ==================================================================== === POTENTIAL DUPLICATE 3/429: pages2k_6+FE23_northamerica_usa_az555 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-MtLemon.Briffa.2002-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/az555-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_6, keep FE23_northamerica_usa_az555. 
write decision to backup file > 5/429,pages2k_50,FE23_northamerica_canada_cana091,3.197082790629511,0.9674400553180403 ==================================================================== === POTENTIAL DUPLICATE 4/429: pages2k_50+FE23_northamerica_canada_cana091 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SmithersSkiArea.Schweingruber.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana091-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_50, keep FE23_northamerica_canada_cana091. write decision to backup file > 6/429,pages2k_62,pages2k_63,0.0,0.9442037258051723 ==================================================================== === POTENTIAL DUPLICATE 5/429: pages2k_62+pages2k_63 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SmithersSkiArea.Schweingruber.1996-2.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SmithersSkiArea.Schweingruber.1996-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).** Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/005_pages2k_62_pages2k_63__20_21.jpg KEEP BLUE CIRCLES: keep pages2k_62, remove pages2k_63. write decision to backup file > 7/429,pages2k_81,ch2k_HE08LRA01_76,0.0,0.9999999922133574 ==================================================================== === POTENTIAL DUPLICATE 6/429: pages2k_81+ch2k_HE08LRA01_76 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-LosRoques.Hetzinger.2008.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/12891 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_81, keep ch2k_HE08LRA01_76. write decision to backup file > 8/429,pages2k_81,iso2k_1813,0.0,0.9999999922133574 ==================================================================== === POTENTIAL DUPLICATE 7/429: pages2k_81+iso2k_1813 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-LosRoques.Hetzinger.2008.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/12891 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_81, keep iso2k_1813. 
write decision to backup file > 9/429,pages2k_83,iso2k_1916,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 8/429: pages2k_83+iso2k_1916 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-Agassiz.Vinther.2008.txt === === URL 2: https://www.ncdc.noaa.gov/paleo-search/study/2431 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_83, keep iso2k_1916. write decision to backup file > 10/429,pages2k_85,pages2k_88,0.0,0.999999996118143 ==================================================================== === POTENTIAL DUPLICATE 9/429: pages2k_85+pages2k_88 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaChepical.deJong.2013.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaChepical.deJong.2013.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_85, remove pages2k_88. 
write decision to backup file > 11/429,pages2k_94,FE23_northamerica_canada_cana153,3.775253335744946,0.996675589148305 ==================================================================== === POTENTIAL DUPLICATE 10/429: pages2k_94+FE23_northamerica_canada_cana153 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-CoppermineRiver.Jacoby.1989.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana153-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_94, keep FE23_northamerica_canada_cana153. write decision to backup file > 12/429,pages2k_107,FE23_northamerica_usa_ak046,3.776010469650001,0.9805747494703706 ==================================================================== === POTENTIAL DUPLICATE 11/429: pages2k_107+FE23_northamerica_usa_ak046 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-KobukNoatak.King.2003.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/ak046-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_107, keep FE23_northamerica_usa_ak046. write decision to backup file > 13/429,pages2k_121,pages2k_122,0.0,0.9815589923346684 ==================================================================== === POTENTIAL DUPLICATE 12/429: pages2k_121+pages2k_122 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-CentralAndes9.Mundo.2014.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-CentralAndes9.Mundo.2014.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_121, remove pages2k_122. write decision to backup file > 14/429,pages2k_132,FE23_northamerica_canada_cana225,3.8914450331074146,0.9956052895756172 ==================================================================== === POTENTIAL DUPLICATE 13/429: pages2k_132+FE23_northamerica_canada_cana225 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-MeadowMountain.Wilson.2005-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana225-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_132, keep FE23_northamerica_canada_cana225. write decision to backup file > 15/429,pages2k_158,FE23_northamerica_usa_wa069,3.6691109671311533,0.9970316184150801 ==================================================================== === POTENTIAL DUPLICATE 14/429: pages2k_158+FE23_northamerica_usa_wa069 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-wa069.Peterson.1994.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wa069-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_158, keep FE23_northamerica_usa_wa069. write decision to backup file > 16/429,pages2k_171,FE23_northamerica_usa_wy021,6.841598008457665,0.9852616141387495 ==================================================================== === POTENTIAL DUPLICATE 15/429: pages2k_171+FE23_northamerica_usa_wy021 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-PowderRiverPass.Briffa.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wy021-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. 
Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_171, keep FE23_northamerica_usa_wy021. write decision to backup file > 17/429,pages2k_203,iso2k_826,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 16/429: pages2k_203+iso2k_826 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-BunakenIsland.Charles.2003.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1903 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_203, keep iso2k_826. write decision to backup file > 18/429,pages2k_225,FE23_northamerica_usa_nv512,4.66349518850797,0.9654504918783551 ==================================================================== === POTENTIAL DUPLICATE 17/429: pages2k_225+FE23_northamerica_usa_nv512 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-PearlPeak.Graybill.1994.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/nv512-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_225, keep FE23_northamerica_usa_nv512. write decision to backup file > 19/429,pages2k_238,iso2k_1044,0.0,0.9999999998644115 ==================================================================== === POTENTIAL DUPLICATE 18/429: pages2k_238+iso2k_1044 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-PlateauRemote.Mosley-Thompson.2013.txt === === URL 2: https://www.ncdc.noaa.gov/paleo-search/study/22479 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_238, keep iso2k_1044. write decision to backup file > 20/429,pages2k_242,ch2k_LI06FIJ01_582,0.5087587517213885,1.0 ==================================================================== === POTENTIAL DUPLICATE 19/429: pages2k_242+ch2k_LI06FIJ01_582 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SavusavuBayAB.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1003973 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_242, keep ch2k_LI06FIJ01_582. 
write decision to backup file > 21/429,pages2k_242,iso2k_353,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 20/429: pages2k_242+iso2k_353 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SavusavuBayAB.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/16216 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_242, keep iso2k_353. write decision to backup file > 22/429,pages2k_258,iso2k_1498,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 21/429: pages2k_258+iso2k_1498 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Rarotongad18O2R.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6089 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_258, keep iso2k_1498. 
write decision to backup file > 23/429,pages2k_263,iso2k_1322,1.1119762877768709,1.0 ==================================================================== === POTENTIAL DUPLICATE 22/429: pages2k_263+iso2k_1322 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Lombok.Charles.2003.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1903 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_263, keep iso2k_1322. write decision to backup file > 24/429,pages2k_267,iso2k_58,0.5087587517215614,0.9997675035800243 ==================================================================== === POTENTIAL DUPLICATE 23/429: pages2k_267+iso2k_58 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SavusavuBayFiji.Bagnato.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1881 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : iso2k_1068
- URL : https://www.ncdc.noaa.gov/paleo/study/1916
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
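The ATTENTION block above is printed when a record appears in more than one candidate pair. A minimal sketch of how such cross-references could be collected is given below. The helper name `other_partners` and the list-of-tuples input are assumptions made for illustration and are not taken from the notebook's code; the dataset IDs are copied from the output above.

```python
from collections import defaultdict


def other_partners(pairs, current):
    """For each record in `current`, list the IDs it is paired with in OTHER candidate pairs.

    `pairs` is a list of (id_1, id_2) tuples; `current` is the pair under review.
    Hypothetical helper, not part of the notebook.
    """
    partners = defaultdict(set)
    for a, b in pairs:
        partners[a].add(b)
        partners[b].add(a)
    first, second = current
    return {
        first: sorted(partners[first] - {second}),
        second: sorted(partners[second] - {first}),
    }


# Candidate pairs taken from the log output above
pairs = [
    ("pages2k_267", "iso2k_58"),
    ("pages2k_267", "iso2k_1068"),
    ("pages2k_271", "iso2k_1861"),
]
extra = other_partners(pairs, ("pages2k_267", "iso2k_58"))
print(extra)
```

A non-empty list for either record indicates that a decision on the current pair also affects another pair, which is the situation the warning asks the operator to watch for.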
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/023_pages2k_267_iso2k_58__99_4352.jpg KEEP BLUE CIRCLES: keep pages2k_267, remove iso2k_58. write decision to backup file > 25/429,pages2k_267,iso2k_1068,0.5087587517215614,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 24/429: pages2k_267+iso2k_1068 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SavusavuBayFiji.Bagnato.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1916 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_267, keep iso2k_1068. write decision to backup file > 26/429,pages2k_271,ch2k_FE18RUS01_492,1.0009991654937371,1.0 ==================================================================== === POTENTIAL DUPLICATE 25/429: pages2k_271+ch2k_FE18RUS01_492 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-RedSea.Felis.2000.txt === === URL 2: https://doi.pangaea.de/10.1594/PANGAEA.891094 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_271, keep ch2k_FE18RUS01_492. 
write decision to backup file > 27/429,pages2k_271,iso2k_1861,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 26/429: pages2k_271+iso2k_1861 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-RedSea.Felis.2000.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1861 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_271, keep iso2k_1861. write decision to backup file > 28/429,pages2k_273,FE23_asia_russ130w,0.43810000685217965,0.9680483685544976 ==================================================================== === POTENTIAL DUPLICATE 27/429: pages2k_273+FE23_asia_russ130w === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-AltaiJablonsky.Cook.2000.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/asia/russ130w-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_273, keep FE23_asia_russ130w. 
write decision to backup file > 29/429,pages2k_281,FE23_northamerica_canada_cana155,4.046549942741809,0.9890405986205609 ==================================================================== === POTENTIAL DUPLICATE 28/429: pages2k_281+FE23_northamerica_canada_cana155 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-HornbyCabin.Jacoby.1989.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana155-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_281, keep FE23_northamerica_canada_cana155. write decision to backup file > 30/429,pages2k_294,FE23_northamerica_usa_ak021,0.9263994682764632,0.9872961006283816 ==================================================================== === POTENTIAL DUPLICATE 29/429: pages2k_294+FE23_northamerica_usa_ak021 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-MinersWell.Wiles.2000.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/ak021-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_294, keep FE23_northamerica_usa_ak021. write decision to backup file > 31/429,pages2k_305,pages2k_309,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 30/429: pages2k_305+pages2k_309 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-EmeraldBasin.Keigwin.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-EmeraldBasin.Keigwin.2007-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/030_pages2k_305_pages2k_309__113_115.jpg KEEP RED CROSSES: remove pages2k_305, keep pages2k_309. write decision to backup file > 32/429,pages2k_307,pages2k_311,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 31/429: pages2k_307+pages2k_311 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-EmeraldBasin.Keigwin.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-EmeraldBasin.Keigwin.2007-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
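The flags printed for each pair suggest when a pair is resolved automatically and when it is passed to the operator. The sketch below reproduces that pattern as it can be read from the output in this section only; it is not the notebook's implementation. The function name `decide`, its arguments and the returned labels are hypothetical, and the rule that `correlation_perfect` means "correlation at or above 0.98 and equal record length" is an inference from the printed values.

```python
def decide(corr, same_length, data_identical, metadata_identical,
           db_1, db_2, corr_threshold=0.98, preferred=("FE23", "pages2k")):
    """Classify a candidate pair as automatically resolved or needing manual review.

    Assumption (inferred from the log, not from the source code):
    correlation_perfect = (corr >= corr_threshold) and same_length.
    """
    correlation_perfect = corr >= corr_threshold and same_length
    if data_identical or correlation_perfect:
        return "auto: records identical"
    # Database hierarchy: with identical metadata, FE23 is preferred over PAGES 2k
    if metadata_identical and {db_1, db_2} == set(preferred):
        return "auto: prefer " + preferred[0]
    return "manual"


# Values copied from pairs 27, 30 and 46 in the output above
print(decide(0.9680483685544976, False, False, True, "pages2k", "FE23"))
print(decide(0.9999999999999999, False, False, True, "pages2k", "pages2k"))
print(decide(0.9830221452780444, True, False, True, "pages2k", "pages2k"))
```

The sketch does not model which of two identical records is kept (the log shows `#2` for cross-database pairs and `#1` for pairs within PAGES 2k), nor the composite option mentioned in the introduction.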
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/031_pages2k_307_pages2k_311__114_116.jpg KEEP RED CROSSES: remove pages2k_307, keep pages2k_311. write decision to backup file > 33/429,pages2k_315,iso2k_362,0.392681883222941,1.0 ==================================================================== === POTENTIAL DUPLICATE 32/429: pages2k_315+iso2k_362 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-Austfonna.Isaksson.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo-search/study/11173 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_315, keep iso2k_362. write decision to backup file > 34/429,pages2k_317,ch2k_NA09MAL01_84,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 33/429: pages2k_317+ch2k_NA09MAL01_84 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Malindi.Nakamura.2009.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/12994 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_317, keep ch2k_NA09MAL01_84. 
write decision to backup file > 35/429,pages2k_317,iso2k_1754,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 34/429: pages2k_317+iso2k_1754 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Malindi.Nakamura.2009.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/12994 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_317, keep iso2k_1754. write decision to backup file > 36/429,pages2k_323,FE23_northamerica_canada_cana210,1.8532231170308289,0.9489902863469498 ==================================================================== === POTENTIAL DUPLICATE 35/429: pages2k_323+FE23_northamerica_canada_cana210 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-MedusaBay.Buckley.2003.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana210-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_323, keep FE23_northamerica_canada_cana210. 
write decision to backup file > 37/429,pages2k_385,ch2k_FE09OGA01_304,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 36/429: pages2k_385+ch2k_FE09OGA01_304 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Miyanohama.Felis.2009.txt === === URL 2: https://doi.pangaea.de/10.1594/PANGAEA.743953 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_385, keep ch2k_FE09OGA01_304. write decision to backup file > 38/429,pages2k_385,iso2k_1922,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 37/429: pages2k_385+iso2k_1922 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Miyanohama.Felis.2009.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/8608 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_385, keep iso2k_1922. 
write decision to backup file > 39/429,pages2k_387,ch2k_FE09OGA01_306,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 38/429: pages2k_387+ch2k_FE09OGA01_306 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Miyanohama.Felis.2009.txt === === URL 2: https://doi.pangaea.de/10.1594/PANGAEA.743953 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_387, keep ch2k_FE09OGA01_306. write decision to backup file > 40/429,pages2k_395,ch2k_CA07FLI01_400,0.0,0.9999999999999998 ==================================================================== === POTENTIAL DUPLICATE 39/429: pages2k_395+ch2k_CA07FLI01_400 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-CoralSea.Calvo.2007.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6087 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_395, keep ch2k_CA07FLI01_400. 
write decision to backup file > 41/429,pages2k_395,iso2k_1057,0.0,0.9999999999999998 ==================================================================== === POTENTIAL DUPLICATE 40/429: pages2k_395+iso2k_1057 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-CoralSea.Calvo.2007.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6087 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_395, keep iso2k_1057. write decision to backup file > 42/429,pages2k_397,ch2k_CA07FLI01_402,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 41/429: pages2k_397+ch2k_CA07FLI01_402 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-CoralSea.Calvo.2007.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6087 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_397, keep ch2k_CA07FLI01_402. 
write decision to backup file > 43/429,pages2k_409,ch2k_QU96ESV01_422,1.0734720312673933,1.0 ==================================================================== === POTENTIAL DUPLICATE 42/429: pages2k_409+ch2k_QU96ESV01_422 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Vanuatu.Quinn.1996.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1839 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_409, keep ch2k_QU96ESV01_422. write decision to backup file > 44/429,pages2k_409,iso2k_218,1.0734720312673933,1.0 ==================================================================== === POTENTIAL DUPLICATE 43/429: pages2k_409+iso2k_218 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Vanuatu.Quinn.1996.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1839 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_409, keep iso2k_218. 
write decision to backup file > 45/429,pages2k_414,pages2k_418,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 44/429: pages2k_414+pages2k_418 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-2.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-3.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_414, remove pages2k_418. write decision to backup file > 46/429,pages2k_417,pages2k_421,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 45/429: pages2k_417+pages2k_421 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-2.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-3.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_417, remove pages2k_421. 
write decision to backup file > 47/429,pages2k_427,pages2k_433,0.0,0.9830221452780444 ==================================================================== === POTENTIAL DUPLICATE 46/429: pages2k_427+pages2k_433 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-3.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-WestSpitzberg.Bonnet.2010-3.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_427, remove pages2k_433. write decision to backup file > 48/429,pages2k_435,pages2k_842,0.0,0.9855993305813949 ==================================================================== === POTENTIAL DUPLICATE 47/429: pages2k_435+pages2k_842 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UULLWD.Schweingruber.2002.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UULMXD.Schweingruber.2002.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_435, remove pages2k_842. 
write decision to backup file > 49/429,pages2k_444,pages2k_445,0.0,0.9999996722608564 ==================================================================== === POTENTIAL DUPLICATE 48/429: pages2k_444+pages2k_445 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_444, remove pages2k_445. write decision to backup file > 50/429,pages2k_444,pages2k_446,0.0,0.9999998768494976 ==================================================================== === POTENTIAL DUPLICATE 49/429: pages2k_444+pages2k_446 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_444, remove pages2k_446. 
write decision to backup file > 51/429,pages2k_445,pages2k_446,0.0,0.9999997978711366 ==================================================================== === POTENTIAL DUPLICATE 50/429: pages2k_445+pages2k_446 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-LagunaAculeo.vonGunten.2009.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_445, remove pages2k_446. write decision to backup file > 52/429,pages2k_462,ch2k_OS14UCP01_236,0.0,0.9999999861930808 ==================================================================== === POTENTIAL DUPLICATE 51/429: pages2k_462+ch2k_OS14UCP01_236 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-PalauUlongChannel.Osborne.2014.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/16339 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_462, keep ch2k_OS14UCP01_236. 
write decision to backup file > 53/429,pages2k_462,iso2k_350,0.0,0.9999999861930808 ==================================================================== === POTENTIAL DUPLICATE 52/429: pages2k_462+iso2k_350 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-PalauUlongChannel.Osborne.2014.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/16339 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_462, keep iso2k_350. write decision to backup file > 54/429,pages2k_468,pages2k_3550,0.0,0.987098513022764 ==================================================================== === POTENTIAL DUPLICATE 53/429: pages2k_468+pages2k_3550 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UULTRW.Schweingruber.2002.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UULEWW.Schweingruber.2002.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_468, remove pages2k_3550. 
write decision to backup file > 55/429,pages2k_468,FE23_asia_russ137w,0.37072945852236205,0.9989192723158098 ==================================================================== === POTENTIAL DUPLICATE 54/429: pages2k_468+FE23_asia_russ137w === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UULTRW.Schweingruber.2002.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/asia/russ137w-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_468, keep FE23_asia_russ137w. write decision to backup file > 56/429,pages2k_472,pages2k_474,0.0,0.9990780776141117 ==================================================================== === POTENTIAL DUPLICATE 55/429: pages2k_472+pages2k_474 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-1.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_472, remove pages2k_474. 
write decision to backup file > 57/429,pages2k_472,pages2k_477,0.0,0.9991742362392582 ==================================================================== === POTENTIAL DUPLICATE 56/429: pages2k_472+pages2k_477 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : pages2k_474
- URL : https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-1.txt
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/056_pages2k_472_pages2k_477__189_192.jpg KEEP BLUE CIRCLES: keep pages2k_472, remove pages2k_477. write decision to backup file > 58/429,pages2k_474,pages2k_477,0.0,0.9999463761497801 ==================================================================== === POTENTIAL DUPLICATE 57/429: pages2k_474+pages2k_477 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FiskBasin.Richey.2009-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
- pages2k_472
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/057_pages2k_474_pages2k_477__190_192.jpg REMOVE BOTH: remove pages2k_477, remove pages2k_474. write decision to backup file > 59/429,pages2k_478,iso2k_1846,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 58/429: pages2k_478+iso2k_1846 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-Dye.Vinther.2010.txt === === URL 2: http://www.iceandclimate.nbi.ku.dk/data/Vinther_etal_2010_data_02feb2010.xls === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_478, keep iso2k_1846. write decision to backup file > 60/429,pages2k_486,FE23_northamerica_usa_ca609,1.8532231170308289,0.9741227085165554 ==================================================================== === POTENTIAL DUPLICATE 59/429: pages2k_486+FE23_northamerica_usa_ca609 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-FishCreekTrail.Biondi.2001.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/ca609-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_486, keep FE23_northamerica_usa_ca609. write decision to backup file > 61/429,pages2k_495,ch2k_LI06RAR01_12,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 60/429: pages2k_495+ch2k_LI06RAR01_12 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Rarotongad18O99.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6089 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_495, keep ch2k_LI06RAR01_12. write decision to backup file > 62/429,pages2k_495,iso2k_1502,0.0,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 61/429: pages2k_495+iso2k_1502 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Rarotongad18O99.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6089 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_495, keep iso2k_1502. 
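The ATTENTION notices above flag records that appear in more than one candidate pair (here pages2k_472, pages2k_474 and pages2k_477). A minimal sketch of how such records could be found from a list of candidate pairs follows. The function name `find_multi_duplicates` and the list-of-tuples input are assumptions made for illustration; they are not taken from the notebook's own code.

```python
from collections import defaultdict

def find_multi_duplicates(pairs):
    """Return {record_id: sorted partner IDs} for IDs that occur in more than one pair.

    Hypothetical helper: the notebook's actual implementation is not shown in the log.
    """
    partners = defaultdict(set)
    for id_1, id_2 in pairs:
        partners[id_1].add(id_2)
        partners[id_2].add(id_1)
    return {rid: sorted(p) for rid, p in partners.items() if len(p) > 1}

# Candidate pairs copied from the log entries above.
pairs = [
    ("pages2k_472", "pages2k_474"),
    ("pages2k_472", "pages2k_477"),
    ("pages2k_474", "pages2k_477"),
    ("pages2k_478", "iso2k_1846"),
]
multi = find_multi_duplicates(pairs)
for record_id, others in multi.items():
    print(record_id, "is also paired with", others)
```

Records listed in `multi` are the ones where a decision on one pair can conflict with a decision on another, which is why the prompt asks for special attention.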
write decision to backup file > 63/429,pages2k_500,ch2k_AS05GUA01_302,0.0243685480819044,1.0 ==================================================================== === POTENTIAL DUPLICATE 62/429: pages2k_500+ch2k_AS05GUA01_302 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-DoubleReef.Asami.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1915 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_500, keep ch2k_AS05GUA01_302. write decision to backup file > 64/429,pages2k_500,iso2k_1559,0.0243685480819044,1.0 ==================================================================== === POTENTIAL DUPLICATE 63/429: pages2k_500+iso2k_1559 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-DoubleReef.Asami.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1915 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_500, keep iso2k_1559. 
write decision to backup file > 65/429,pages2k_541,iso2k_404,0.5619803469445591,1.0 ==================================================================== === POTENTIAL DUPLICATE 64/429: pages2k_541+iso2k_404 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-MES.Rhodes.2012.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/13175 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/064_pages2k_541_iso2k_404__216_4431.jpg KEEP RED CROSSES: remove pages2k_541, keep iso2k_404. write decision to backup file > 66/429,pages2k_543,pages2k_976,0.0,0.9932155933107505 ==================================================================== === POTENTIAL DUPLICATE 65/429: pages2k_543+pages2k_976 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UKHMXD.Schweingruber.2002.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UKHLWD.Schweingruber.2002.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_543, remove pages2k_976. write decision to backup file > 67/429,pages2k_565,iso2k_998,0.0,0.9999999999999998 ==================================================================== === POTENTIAL DUPLICATE 66/429: pages2k_565+iso2k_998 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-PennyIceCapP96.Fisher.1998.txt === === URL 2: www.ncdc.noaa.gov/paleo/study/2474 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_565, keep iso2k_998. 
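The flags printed for each pair suggest a simple rule order. The sketch below is inferred only from the entries in this log and is not the notebook's actual code: `correlation_perfect` appears to be true when the correlation is at least 0.98 and both time axes have equal length, identical data takes precedence over perfect correlation, and the FE23-over-PAGES 2k rule applies when metadata are identical. The function names, the ID-prefix test and the returned labels are assumptions.

```python
def correlation_perfect(corr, n_1, n_2, threshold=0.98):
    """Flag as printed in the log: corr >= 0.98 and len(time_1) == len(time_2)."""
    return corr >= threshold and n_1 == n_2

def classify_pair(id_1, id_2, metadata_identical, data_identical, corr_perfect):
    """Return which branch of the decision process a pair would fall into.

    Inferred from the log output only; labels are hypothetical.
    """
    if data_identical:
        return "identical data"
    if corr_perfect:
        return "perfect correlation"
    # Database is guessed from the ID prefix (assumption).
    prefixes = {id_1.split("_")[0], id_2.split("_")[0]}
    if metadata_identical and prefixes == {"pages2k", "FE23"}:
        return "FE23 over PAGES 2k"
    return "manual decision"

# Flag values copied from the log; the series lengths are not printed,
# so the lengths passed to correlation_perfect below are placeholders.
print(classify_pair("pages2k_468", "FE23_asia_russ137w", True, False, False))
print(classify_pair("pages2k_472", "pages2k_477", True, False, False))
print(classify_pair("pages2k_478", "iso2k_1846", True, True, True))
print(correlation_perfect(0.9989192723158098, 100, 120))
```

Pairs that fall through to "manual decision" are the ones for which the notebook shows the figure and prompts the operator.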
write decision to backup file > 68/429,pages2k_583,FE23_northamerica_usa_mt116,4.265038441214223,0.9940277703282503 ==================================================================== === POTENTIAL DUPLICATE 67/429: pages2k_583+FE23_northamerica_usa_mt116 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-FlintCreekRange.Hughes.2005.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/mt116-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_583, keep FE23_northamerica_usa_mt116. write decision to backup file > 69/429,pages2k_592,ch2k_LI06RAR02_270,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 68/429: pages2k_592+ch2k_LI06RAR02_270 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Rarotongad18O3R.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6089 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_592, keep ch2k_LI06RAR02_270. 
write decision to backup file > 70/429,pages2k_592,iso2k_1500,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 69/429: pages2k_592+iso2k_1500 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Rarotongad18O3R.Linsley.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6089 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_592, keep iso2k_1500. write decision to backup file > 71/429,pages2k_610,iso2k_1199,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 70/429: pages2k_610+iso2k_1199 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-Ferrigno.Thomas.2013.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/22477 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_610, keep iso2k_1199. 
write decision to backup file > 72/429,pages2k_626,FE23_northamerica_usa_wa071,3.6691109671311533,0.9790599696794638 ==================================================================== === POTENTIAL DUPLICATE 71/429: pages2k_626+FE23_northamerica_usa_wa071 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-wa071.Peterson.1994.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wa071-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_626, keep FE23_northamerica_usa_wa071. write decision to backup file > 73/429,pages2k_691,FE23_northamerica_canada_cana062,3.8470954123399754,0.935513152519886 ==================================================================== === POTENTIAL DUPLICATE 72/429: pages2k_691+FE23_northamerica_canada_cana062 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-LacRomanel.Schweingruber.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana062-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_691, keep FE23_northamerica_canada_cana062. write decision to backup file > 74/429,pages2k_730,iso2k_396,5.414343588593776,1.0 ==================================================================== === POTENTIAL DUPLICATE 73/429: pages2k_730+iso2k_396 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Eur-SpannagelCave.Mangini.2005.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/5433 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_730, keep iso2k_396. write decision to backup file > 75/429,pages2k_736,FE23_northamerica_usa_wy024,3.2176480591281234,0.9937765458543829 ==================================================================== === POTENTIAL DUPLICATE 74/429: pages2k_736+FE23_northamerica_usa_wy024 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-GranitePassHuntMountain.Briffa.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wy024-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_736, keep FE23_northamerica_usa_wy024. 
write decision to backup file > 76/429,pages2k_800,FE23_northamerica_canada_cana234,4.413706727317208,0.9890384832658766 ==================================================================== === POTENTIAL DUPLICATE 75/429: pages2k_800+FE23_northamerica_canada_cana234 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-BigWhite2.Wilson.2005-2.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana234-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_800, keep FE23_northamerica_canada_cana234. write decision to backup file > 77/429,pages2k_818,iso2k_488,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 76/429: pages2k_818+iso2k_488 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-Puruogangri.Thompson.2006.txt === === URL 2: nan === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_818, keep iso2k_488. 
write decision to backup file > 78/429,pages2k_827,pages2k_830,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 77/429: pages2k_827+pages2k_830 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-CH07-98-MC-22.Saenger.2011-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-CH07-98-MC-22.Saenger.2011-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_827, remove pages2k_830. write decision to backup file > 79/429,pages2k_831,pages2k_2220,0.0,0.9910287300605741 ==================================================================== === POTENTIAL DUPLICATE 78/429: pages2k_831+pages2k_2220 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UKHTRW.Schweingruber.2002.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UKHEWW.Schweingruber.2002.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_831, remove pages2k_2220. 
write decision to backup file > 80/429,pages2k_831,FE23_asia_russ127w,0.23755598128809655,0.9938897185424319 ==================================================================== === POTENTIAL DUPLICATE 79/429: pages2k_831+FE23_asia_russ127w === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Asi-UKHTRW.Schweingruber.2002.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/asia/russ127w-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_831, keep FE23_asia_russ127w. write decision to backup file > 81/429,pages2k_857,FE23_northamerica_usa_ut511,4.779603504757881,0.9930030347228277 ==================================================================== === POTENTIAL DUPLICATE 80/429: pages2k_857+FE23_northamerica_usa_ut511 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-CeaderBreaks.Briffa.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/ut511-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_857, keep FE23_northamerica_usa_ut511. write decision to backup file > 82/429,pages2k_881,iso2k_1010,7.831443798552019,0.9999999999999999 ==================================================================== === POTENTIAL DUPLICATE 81/429: pages2k_881+iso2k_1010 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Malindi.Cole.2000.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1855 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_881, keep iso2k_1010. write decision to backup file > 83/429,pages2k_893,pages2k_895,0.0,0.997878731888924 ==================================================================== === POTENTIAL DUPLICATE 82/429: pages2k_893+pages2k_895 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-1.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_893, remove pages2k_895. 
write decision to backup file > 84/429,pages2k_893,pages2k_900,0.0,0.997878731888924 ==================================================================== === POTENTIAL DUPLICATE 83/429: pages2k_893+pages2k_900 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_893, remove pages2k_900. write decision to backup file > 85/429,pages2k_895,pages2k_900,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 84/429: pages2k_895+pages2k_900 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FeniDrift.Richter.2009-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_895, remove pages2k_900. 
write decision to backup file > 86/429,pages2k_940,ch2k_DR99ABR01_264,0.0,0.9999868748284537 ==================================================================== === POTENTIAL DUPLICATE 85/429: pages2k_940+ch2k_DR99ABR01_264 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-AbrahamReef.Druffel.1999.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1911 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_940, keep ch2k_DR99ABR01_264. write decision to backup file > 87/429,pages2k_940,ch2k_DR99ABR01_266,0.0,0.9999868748284537 ==================================================================== === POTENTIAL DUPLICATE 86/429: pages2k_940+ch2k_DR99ABR01_266 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-AbrahamReef.Druffel.1999.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1911 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_940, keep ch2k_DR99ABR01_266. 
write decision to backup file > 88/429,pages2k_940,iso2k_91,0.0,0.9999868748284537 ==================================================================== === POTENTIAL DUPLICATE 87/429: pages2k_940+iso2k_91 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-AbrahamReef.Druffel.1999.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1911 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_940, keep iso2k_91. write decision to backup file > 89/429,pages2k_945,iso2k_100,0.0,0.999999999878647 ==================================================================== === POTENTIAL DUPLICATE 88/429: pages2k_945+iso2k_100 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-CoastalDML.Thamban.2012.txt === === URL 2: https://www.ncdc.noaa.gov/paleo-search/study/22589 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_945, keep iso2k_100. 
write decision to backup file > 90/429,pages2k_960,iso2k_641,1.4788881660628725,1.0 ==================================================================== === POTENTIAL DUPLICATE 89/429: pages2k_960+iso2k_641 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-FloridaBay.Swart.1996.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1886 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_960, keep iso2k_641. write decision to backup file > 91/429,pages2k_982,FE23_northamerica_usa_or042,4.592357698872727,0.9964512838580512 ==================================================================== === POTENTIAL DUPLICATE 90/429: pages2k_982+FE23_northamerica_usa_or042 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-CraterLakeNE.Briffa.2002-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/or042-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_982, keep FE23_northamerica_usa_or042. 
write decision to backup file > 92/429,pages2k_1004,iso2k_644,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 91/429: pages2k_1004+iso2k_644 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-DomeF1993.Uemura.2014.txt === === URL 2: https://www.ncdc.noaa.gov/paleo-search/study/22471 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1004, keep iso2k_644. write decision to backup file > 93/429,pages2k_1026,FE23_northamerica_usa_az553,3.706446234061719,0.989435703033374 ==================================================================== === POTENTIAL DUPLICATE 92/429: pages2k_1026+FE23_northamerica_usa_az553 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SnowBowlSanFranciscoPeak.Briffa.2002-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/az553-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1026, keep FE23_northamerica_usa_az553. 
write decision to backup file > 94/429,pages2k_1048,iso2k_1212,0.08120329462182123,0.9999999999444483 ==================================================================== === POTENTIAL DUPLICATE 93/429: pages2k_1048+iso2k_1212 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-VLG.Bertler.2011.txt === === URL 2: https://doi.pangaea.de/10.1594/PANGAEA.866368 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1048, keep iso2k_1212. write decision to backup file > 95/429,pages2k_1089,FE23_northamerica_usa_mt112,1.3032489760048571,0.9906780912896159 ==================================================================== === POTENTIAL DUPLICATE 94/429: pages2k_1089+FE23_northamerica_usa_mt112 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-YellowMountainRidge.King.2002.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/mt112-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1089, keep FE23_northamerica_usa_mt112. 
write decision to backup file > 96/429,pages2k_1089,FE23_northamerica_usa_mt113,1.3032489760048571,0.8759857906162318 ==================================================================== === POTENTIAL DUPLICATE 95/429: pages2k_1089+FE23_northamerica_usa_mt113 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-YellowMountainRidge.King.2002.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/mt113-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1089, keep FE23_northamerica_usa_mt113. write decision to backup file > 97/429,pages2k_1108,iso2k_1060,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 96/429: pages2k_1108+iso2k_1060 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-BermudaSouthShore.Goodkin.2008-1.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/6115 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1108, keep iso2k_1060. 
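The flags printed for each pair hint at how the automated choices are reached. The following is a minimal sketch reconstructed only from the log output in this excerpt, not the notebook's actual code: the function names `database_of` and `auto_decision`, their arguments, and the use of the dataset ID prefix to identify the database are all hypothetical.

```python
def database_of(dataset_id):
    """Return the database prefix of a dataset ID, e.g. 'pages2k' for 'pages2k_940'.

    ASSUMPTION: the prefix before the first underscore names the database.
    """
    return dataset_id.split("_")[0]


def auto_decision(id_1, id_2, corr, same_length, data_identical, metadata_identical):
    """Return 'keep_1', 'keep_2' or 'manual' for one candidate pair (hypothetical helper)."""
    # In the log, correlation_perfect is True only when the correlation is >= 0.98
    # AND both records have the same number of time steps.
    correlation_perfect = corr >= 0.98 and same_length
    db_1, db_2 = database_of(id_1), database_of(id_2)

    if data_identical or correlation_perfect:
        # "RECORDS IDENTICAL": in this excerpt record 1 is kept when both records
        # come from the same database, otherwise record 2 is kept.
        return "keep_1" if db_1 == db_2 else "keep_2"

    if metadata_identical and {db_1, db_2} == {"pages2k", "FE23"}:
        # "Metadata identical, automatically choose FE23 ... over PAGES 2k".
        return "keep_1" if db_1 == "FE23" else "keep_2"

    # Everything else is handed to the operator.
    return "manual"


# Pair 95/429 from the log: low correlation, but identical metadata and FE23 vs PAGES 2k.
print(auto_decision("pages2k_1089", "FE23_northamerica_usa_mt113",
                    0.8759857906162318, False, False, True))
```

Note that under these inferred rules a pair such as pages2k_1147+ch2k_DA06MAF01_78 (correlation 1.0 but unequal record lengths) falls through to a manual decision, which matches the prompt shown for that pair in the output.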
write decision to backup file > 98/429,pages2k_1116,FE23_northamerica_canada_cana170w,2.963034008750589,0.9012973226223439 ==================================================================== === POTENTIAL DUPLICATE 97/429: pages2k_1116+FE23_northamerica_canada_cana170w === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-Athabasca.Schweingruber.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana170w-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1116, keep FE23_northamerica_canada_cana170w. write decision to backup file > 99/429,pages2k_1147,ch2k_DA06MAF01_78,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 98/429: pages2k_1147+ch2k_DA06MAF01_78 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Mafia.Damassa.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/10808 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : ch2k_DA06MAF02_104
- URL : https://www.ncdc.noaa.gov/paleo/study/10808
............................................................
- Dataset ID : iso2k_1748
- URL : https://www.ncdc.noaa.gov/paleo/study/10808
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/098_pages2k_1147_ch2k_DA06MAF01_78__428_4147.jpg KEEP BLUE CIRCLES: keep pages2k_1147, remove ch2k_DA06MAF01_78. write decision to backup file > 100/429,pages2k_1147,ch2k_DA06MAF02_104,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 99/429: pages2k_1147+ch2k_DA06MAF02_104 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Mafia.Damassa.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/10808 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : ch2k_DA06MAF01_78
- URL : https://www.ncdc.noaa.gov/paleo/study/10808
............................................................
- Dataset ID : iso2k_1748
- URL : https://www.ncdc.noaa.gov/paleo/study/10808
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/099_pages2k_1147_ch2k_DA06MAF02_104__428_4153.jpg KEEP BLUE CIRCLES: keep pages2k_1147, remove ch2k_DA06MAF02_104. write decision to backup file > 101/429,pages2k_1147,iso2k_1748,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 100/429: pages2k_1147+iso2k_1748 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Mafia.Damassa.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/10808 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1147, keep iso2k_1748. write decision to backup file > 102/429,pages2k_1153,pages2k_1156,0.0,0.9967946398784733 ==================================================================== === POTENTIAL DUPLICATE 101/429: pages2k_1153+pages2k_1156 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-1.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_1153, remove pages2k_1156. 
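The "associated with multiple duplicates" warning above can be reproduced by collecting, for every record, the set of partners it shares a candidate pair with. This is a small sketch under the assumption that the candidate pairs are available as a list of ID tuples; `partners_by_record` is a hypothetical helper, not a function from the notebook.

```python
from collections import defaultdict


def partners_by_record(pairs):
    """Map each dataset ID that occurs in more than one pair to its sorted partner IDs."""
    partners = defaultdict(set)
    for id_1, id_2 in pairs:
        partners[id_1].add(id_2)
        partners[id_2].add(id_1)
    # Keep only the records that are involved in several candidate pairs.
    return {rec: sorted(ids) for rec, ids in partners.items() if len(ids) > 1}


# Candidate pairs taken from the log output above.
pairs = [
    ("pages2k_1147", "ch2k_DA06MAF01_78"),
    ("pages2k_1147", "ch2k_DA06MAF02_104"),
    ("pages2k_1147", "iso2k_1748"),
    ("pages2k_1108", "iso2k_1060"),
]

for record, others in partners_by_record(pairs).items():
    print(record, "is also paired with", others)
```

When reviewing one pair of such a record, the other partners are the ones listed under "The potential duplicates also associated with this record".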
write decision to backup file > 103/429,pages2k_1153,pages2k_1160,0.0,0.9969118556506823 ==================================================================== === POTENTIAL DUPLICATE 102/429: pages2k_1153+pages2k_1160 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_1153, remove pages2k_1160. write decision to backup file > 104/429,pages2k_1156,pages2k_1160,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 103/429: pages2k_1156+pages2k_1160 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ODP984.Came.2007-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_1156, remove pages2k_1160. 
write decision to backup file > 105/429,pages2k_1209,FE23_northamerica_usa_co553,4.686190390758007,0.9848164514971338 ==================================================================== === POTENTIAL DUPLICATE 104/429: pages2k_1209+FE23_northamerica_usa_co553 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-PikePeaks.Harlan.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/co553-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1209, keep FE23_northamerica_usa_co553. write decision to backup file > 106/429,pages2k_1252,FE23_northamerica_canada_cana096,5.559669351093426,0.9976191515076486 ==================================================================== === POTENTIAL DUPLICATE 105/429: pages2k_1252+FE23_northamerica_canada_cana096 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SunwaptaPass.Schweingruber.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana096-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. 
conservative replication requirement KEEP RED CROSSES: remove pages2k_1252, keep FE23_northamerica_canada_cana096. write decision to backup file > 107/429,pages2k_1274,iso2k_1577,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 106/429: pages2k_1274+iso2k_1577 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-GRIP.Vinther.2010.txt === === URL 2: https://doi.pangaea.de/10.1594/PANGAEA.786354 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1274, keep iso2k_1577. write decision to backup file > 108/429,pages2k_1293,iso2k_821,0.0,0.9999999993592802 ==================================================================== === POTENTIAL DUPLICATE 107/429: pages2k_1293+iso2k_821 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ant-TALDICE.Stenni.2010.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/22502 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1293, keep iso2k_821. 
write decision to backup file > 109/429,pages2k_1325,FE23_northamerica_usa_wy030,4.634071826744492,0.9841375865267528 ==================================================================== === POTENTIAL DUPLICATE 108/429: pages2k_1325+FE23_northamerica_usa_wy030 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SheepTrail.Brown.2005-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wy030-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1325, keep FE23_northamerica_usa_wy030. write decision to backup file > 110/429,pages2k_1360,ch2k_UR00MAI01_22,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 109/429: pages2k_1360+ch2k_UR00MAI01_22 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Maiana.Urban.2000.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1859 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1360, keep ch2k_UR00MAI01_22. 
write decision to backup file > 111/429,pages2k_1360,iso2k_94,7.450067481225913,0.999999993382913 ==================================================================== === POTENTIAL DUPLICATE 110/429: pages2k_1360+iso2k_94 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Maiana.Urban.2000.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1859 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1360, keep iso2k_94. write decision to backup file > 112/429,pages2k_1360,iso2k_98,7.450067481225913,0.999999993382913 ==================================================================== === POTENTIAL DUPLICATE 111/429: pages2k_1360+iso2k_98 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Maiana.Urban.2000.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1859 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1360, keep iso2k_98. 
write decision to backup file > 113/429,pages2k_1362,pages2k_1365,0.0,0.9992389529344313 ==================================================================== === POTENTIAL DUPLICATE 112/429: pages2k_1362+pages2k_1365 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-GulfofGuinea.Weldeab.2007-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-GulfofGuinea.Weldeab.2007-1.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: True data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation). Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_1362, remove pages2k_1365. write decision to backup file > 114/429,pages2k_1370,iso2k_1619,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 113/429: pages2k_1370+iso2k_1619 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Arc-NGRIP1.Vinther.2006.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/8700 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1370, keep iso2k_1619. 
write decision to backup file > 115/429,pages2k_1420,FE23_northamerica_canada_cana111,2.210833453585409,0.9185079638747947 ==================================================================== === POTENTIAL DUPLICATE 114/429: pages2k_1420+FE23_northamerica_canada_cana111 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-VancouverCyprusProvincialPark.Briffa.1996-1.txt === === URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana111-noaa.rwl === True if pot_dup_corrs[i_pot_dups]>=0.98 else False False (len(time_1)==len(time_2)) False metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0. conservative replication requirement KEEP RED CROSSES: remove pages2k_1420, keep FE23_northamerica_canada_cana111. write decision to backup file > 116/429,pages2k_1442,pages2k_1444,0.0,1.0 ==================================================================== === POTENTIAL DUPLICATE 115/429: pages2k_1442+pages2k_1444 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-LaurentianFan.Keigwin.2005-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-LaurentianFan.Keigwin.2005-2.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: True correlation_perfect: True RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1. KEEP BLUE CIRCLES: keep pages2k_1442, remove pages2k_1444. 
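Each "write decision to backup file" message echoes one comma-separated line. A minimal sketch for reading such a line back is given below. The field names are assumptions: the last value behaves like the correlation that is tested against 0.98 in the log, while the meaning of the fourth value is not stated in this output and is labelled `distance` here purely as a guess. `parse_backup_line` is a hypothetical helper.

```python
def parse_backup_line(line):
    """Split one echoed backup line into named fields (field names are assumptions)."""
    counter, id_1, id_2, fourth, fifth = line.strip().split(",")
    index, total = (int(part) for part in counter.split("/"))
    return {
        "index": index,               # running number of the pair
        "total": total,               # total number of candidate pairs
        "id_1": id_1,
        "id_2": id_2,
        "distance": float(fourth),    # ASSUMPTION: meaning not confirmed by the log
        "correlation": float(fifth),  # ASSUMPTION: consistent with the >=0.98 test
    }


record = parse_backup_line("117/429,pages2k_1488,pages2k_1628,0.0,0.9999312398195644")
print(record["id_1"], record["id_2"], record["correlation"] >= 0.98)
```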
write decision to backup file > 117/429,pages2k_1488,pages2k_1628,0.0,0.9999312398195644 ==================================================================== === POTENTIAL DUPLICATE 116/429: pages2k_1488+pages2k_1628 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-1.txt === === URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Cobb.2003.txt === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: True URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : ch2k_NU11PAL01_52
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
............................................................
- Dataset ID : iso2k_505
- URL : https://www.ncdc.noaa.gov/paleo/study/1875
............................................................
- Dataset ID : iso2k_579
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/116_pages2k_1488_pages2k_1628__542_595.jpg KEEP RED CROSSES: remove pages2k_1488, keep pages2k_1628. write decision to backup file > 118/429,pages2k_1488,ch2k_NU11PAL01_52,0.4697858835846662,0.9992710430546085 ==================================================================== === POTENTIAL DUPLICATE 117/429: pages2k_1488+ch2k_NU11PAL01_52 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-1.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) True metadata_identical: True lat True lon True elevation True archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: True RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2. KEEP RED CROSSES: remove pages2k_1488, keep ch2k_NU11PAL01_52. write decision to backup file > 119/429,pages2k_1488,iso2k_505,0.0,0.9976024758754877 ==================================================================== === POTENTIAL DUPLICATE 118/429: pages2k_1488+iso2k_505 === === URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-1.txt === === URL 2: https://www.ncdc.noaa.gov/paleo/study/1875 === True if pot_dup_corrs[i_pot_dups]>=0.98 else False True (len(time_1)==len(time_2)) False metadata_identical: False lat True lon True elevation False archivetype True paleodata_proxy True sites_identical: False URL_identical: False data_identical: False correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
............................................................
- Dataset ID : pages2k_1628
- URL : https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Cobb.2003.txt
............................................................
- Dataset ID : ch2k_NU11PAL01_52
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
............................................................
- Dataset ID : iso2k_579
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/118_pages2k_1488_iso2k_505__542_4456.jpg
REMOVE BOTH: remove iso2k_505, remove pages2k_1488.
write decision to backup file
> 120/429,pages2k_1488,iso2k_579,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 119/429: pages2k_1488+iso2k_579 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-1.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: True
correlation_perfect: True
RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2.
KEEP RED CROSSES: remove pages2k_1488, keep iso2k_579.
write decision to backup file
> 121/429,pages2k_1490,ch2k_NU11PAL01_54,0.4697858835846662,0.999992275602027
====================================================================
=== POTENTIAL DUPLICATE 120/429: pages2k_1490+ch2k_NU11PAL01_54 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-1.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: True
RECORDS IDENTICAL (perfect correlation) except for metadata. Automatically choose #2.
KEEP RED CROSSES: remove pages2k_1490, keep ch2k_NU11PAL01_54.
write decision to backup file
> 122/429,pages2k_1491,iso2k_575,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 121/429: pages2k_1491+iso2k_575 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Nurhati.2011-2.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: True
correlation_perfect: True
RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2.
KEEP RED CROSSES: remove pages2k_1491, keep iso2k_575.
write decision to backup file
> 123/429,pages2k_1497,iso2k_1885,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 122/429: pages2k_1497+iso2k_1885 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/SAm-QuelccayaIceCap.Thompson.2013.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/14174 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: True
correlation_perfect: True
RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2.
KEEP RED CROSSES: remove pages2k_1497, keep iso2k_1885.
write decision to backup file
> 124/429,pages2k_1515,pages2k_1519,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 123/429: pages2k_1515+pages2k_1519 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SouthChinaSea.Zhao.2006-1.txt ===
=== URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SouthChinaSea.Zhao.2006-2.txt ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: True
correlation_perfect: True
RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #1.
KEEP BLUE CIRCLES: keep pages2k_1515, remove pages2k_1519.
write decision to backup file
> 125/429,pages2k_1520,pages2k_1522,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 124/429: pages2k_1520+pages2k_1522 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SubTropicalEasternNorthAtlantic.deMenocal.2000-1.txt ===
=== URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-SubTropicalEasternNorthAtlantic.deMenocal.2000-2.txt ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/124_pages2k_1520_pages2k_1522__553_554.jpg
KEEP RED CROSSES: remove pages2k_1520, keep pages2k_1522.
write decision to backup file
> 126/429,pages2k_1547,iso2k_259,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 125/429: pages2k_1547+iso2k_259 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Kiritimati.Evans.1998.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/1847 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) True
metadata_identical: False
lat True lon True elevation False archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: True
correlation_perfect: True
RECORDS IDENTICAL (identical data) except for metadata. Automatically choose #2.
KEEP RED CROSSES: remove pages2k_1547, keep iso2k_259.
write decision to backup file
> 127/429,pages2k_1566,FE23_northamerica_canada_cana231,4.390485654920984,0.9977556148240536
====================================================================
=== POTENTIAL DUPLICATE 126/429: pages2k_1566+FE23_northamerica_canada_cana231 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-ParkMountain.Wilson.2005-1.txt ===
=== URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/canada/cana231-noaa.rwl ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: False
Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0.
conservative replication requirement
KEEP RED CROSSES: remove pages2k_1566, keep FE23_northamerica_canada_cana231.
write decision to backup file
> 128/429,pages2k_1605,FE23_northamerica_usa_ca606,3.985226443144479,0.9693885470026882
====================================================================
=== POTENTIAL DUPLICATE 127/429: pages2k_1605+FE23_northamerica_usa_ca606 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-SpillwayLakeYosemiteNationalPark.King.2000.txt ===
=== URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/ca606-noaa.rwl ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False False
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: False
Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0.
conservative replication requirement
KEEP RED CROSSES: remove pages2k_1605, keep FE23_northamerica_usa_ca606.
write decision to backup file
> 129/429,pages2k_1619,pages2k_1623,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 128/429: pages2k_1619+pages2k_1623 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-JacafFjord.Seplveda.2009-1.txt ===
=== URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-JacafFjord.Seplveda.2009-2.txt ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/128_pages2k_1619_pages2k_1623__590_592.jpg
KEEP RED CROSSES: remove pages2k_1619, keep pages2k_1623.
write decision to backup file
> 130/429,pages2k_1628,ch2k_NU11PAL01_52,0.4697858835846662,0.9999312398195646
====================================================================
=== POTENTIAL DUPLICATE 129/429: pages2k_1628+ch2k_NU11PAL01_52 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Cobb.2003.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: False
lat True lon True elevation False archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
- pages2k_1488
............................................................
- Dataset ID : iso2k_505
- URL : https://www.ncdc.noaa.gov/paleo/study/1875
............................................................
- Dataset ID : iso2k_579
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/129_pages2k_1628_ch2k_NU11PAL01_52__595_4138.jpg
KEEP BLUE CIRCLES: keep pages2k_1628, remove ch2k_NU11PAL01_52.
write decision to backup file
> 131/429,pages2k_1628,iso2k_505,0.0,0.9973624471178902
====================================================================
=== POTENTIAL DUPLICATE 130/429: pages2k_1628+iso2k_505 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Cobb.2003.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/1875 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
- pages2k_1488
............................................................
- Dataset ID : ch2k_NU11PAL01_52
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
............................................................
- Dataset ID : iso2k_579
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/130_pages2k_1628_iso2k_505__595_4456.jpg
KEEP BLUE CIRCLES: keep pages2k_1628, remove iso2k_505.
write decision to backup file
> 132/429,pages2k_1628,iso2k_579,0.0,0.9999312398195646
====================================================================
=== POTENTIAL DUPLICATE 131/429: pages2k_1628+iso2k_579 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-Palmyra.Cobb.2003.txt ===
=== URL 2: https://www.ncdc.noaa.gov/paleo/study/10373 ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: False
lat True lon True elevation False archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
---------------------------------------------------------------------------------------------------------
***ATTENTION*** THIS RECORD IS ASSOCIATED WITH MULTIPLE DUPLICATES! PLEASE PAY SPECIAL ATTENTION WHEN MAKING DECISIONS FOR THIS RECORD!
The potential duplicates also associated with this record are:
- pages2k_1488
............................................................
- Dataset ID : ch2k_NU11PAL01_52
- URL : https://www.ncdc.noaa.gov/paleo/study/10373
............................................................
- Dataset ID : iso2k_505
- URL : https://www.ncdc.noaa.gov/paleo/study/1875
---------------------------------------------------------------------------------------------------------
Before inputting your decision. Would you like to leave a comment on your decision process?
saved figure in /home/jupyter-lluecke/dod2k/figs//dup_detection/all_merged/131_pages2k_1628_iso2k_579__595_4482.jpg
KEEP BLUE CIRCLES: keep pages2k_1628, remove iso2k_579.
write decision to backup file
> 133/429,pages2k_1636,FE23_northamerica_usa_wa081,6.762315219649745,0.9980630247518526
====================================================================
=== POTENTIAL DUPLICATE 132/429: pages2k_1636+FE23_northamerica_usa_wa081 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/NAm-MtStHelens.Briffa.1996-1.txt ===
=== URL 2: https://www.ncei.noaa.gov/pub/data/paleo/treering/measurements/northamerica/usa/wa081-noaa.rwl ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: False
URL_identical: False
data_identical: False
correlation_perfect: False
Automated choice. Metadata identical, automatically choose FE23 (Breitenmoser et al. (2014)) over PAGES 2k v2.2.0.
conservative replication requirement
KEEP RED CROSSES: remove pages2k_1636, keep FE23_northamerica_usa_wa081.
write decision to backup file
> 134/429,pages2k_1686,pages2k_1688,0.0,1.0
====================================================================
=== POTENTIAL DUPLICATE 133/429: pages2k_1686+pages2k_1688 ===
=== URL 1: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ArabianSea.Doose-Rolinski.2001-1.txt ===
=== URL 2: https://www1.ncdc.noaa.gov/pub/data/paleo/pages2k/pages2k-temperature-v2-2017/data-version-2.0.0/Ocn-ArabianSea.Doose-Rolinski.2001-2.txt ===
True if pot_dup_corrs[i_pot_dups]>=0.98 else False True
(len(time_1)==len(time_2)) False
metadata_identical: True
lat True lon True elevation True archivetype True paleodata_proxy True
sites_identical: True
URL_identical: False
data_identical: False
correlation_perfect: False
**Decision required for this duplicate pair (see figure above).**
Before inputting your decision. Would you like to leave a comment on your decision process?
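The boolean flags printed for each candidate pair in the log above can be read as a small set of checks. The following is a minimal sketch of how such flags might be derived, not the notebook's actual implementation. The function name `pair_flags` and its arguments are hypothetical. Only the two expressions `pot_dup_corrs[i_pot_dups]>=0.98` and `len(time_1)==len(time_2)` are taken from the log; treating `data_identical` as an element-wise comparison and `correlation_perfect` as "high correlation and equal length" are assumptions that happen to agree with the logged examples.

```python
import numpy as np


def pair_flags(time_1, values_1, time_2, values_2, corr, threshold=0.98):
    """Return illustrative boolean checks for one candidate pair.

    `corr` is the precomputed correlation of the pair (the role played by
    pot_dup_corrs[i_pot_dups] in the log). All names here are hypothetical.
    """
    # The two checks whose expressions appear verbatim in the log output.
    corr_above_threshold = bool(corr >= threshold)
    same_length = len(time_1) == len(time_2)

    # Assumption: "identical data" means equal length and equal values.
    data_identical = bool(
        same_length and np.array_equal(np.asarray(values_1), np.asarray(values_2))
    )

    # Assumption: "perfect correlation" requires both checks above to pass.
    correlation_perfect = corr_above_threshold and same_length

    return {
        "corr_above_threshold": corr_above_threshold,
        "same_length": same_length,
        "data_identical": data_identical,
        "correlation_perfect": correlation_perfect,
    }
```

Under these assumptions, a pair with a correlation of 0.9977 but different record lengths passes the threshold check and still fails `correlation_perfect`, which is the pattern logged for pair 126/429.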
# Date stamp in YY-MM-DD format (UTC), as used in the decision file name
date = str(datetime.datetime.utcnow())[2:10]
# Look for the final decision file written by the decision process
fn = utf.find(f'dup_decisions_{df.name}_{initials}_{date}.csv', f'data/{df.name}/dup_detection')
if fn != []:
    # At least one matching file exists: report success and list the file(s)
    print('----------------------------------------------------')
    print('Successfully finished the duplicate decision process!'.upper())
    print('----------------------------------------------------')
    print('Saved the decision output file in:')
    print()
    for ff in fn:
        print('%s.'%ff)
    print()
    print('You are now able to proceed with the next notebook: dup_removal.ipynb')
else:
    # No matching file found: the decision process did not complete
    print('!!!!!!!!!!!!WARNING!!!!!!!!!!!')
    print('Final output file is missing.')
    print()
    print('Please re-run the notebook to complete duplicate decision process.')
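The final check above builds a date-stamped file name and searches for it in the database's `dup_detection` folder. For readers without access to the repository's `utf` helper module, the same lookup can be sketched with the standard library. This is an illustrative stand-in, not the behaviour of `utf.find`, whose internals are not shown here. The function name `find_decision_files` and the `root` and `when` parameters are hypothetical.

```python
import datetime
from pathlib import Path


def find_decision_files(db_name, initials, root=".", when=None):
    """List decision files for one database, operator and UTC date.

    Hypothetical stand-in for the notebook's utf.find call; the file name
    pattern and folder layout are copied from the cell above.
    """
    # Default to the current UTC time, formatted as YY-MM-DD.
    when = when or datetime.datetime.now(datetime.timezone.utc)
    date = when.strftime("%y-%m-%d")

    folder = Path(root) / "data" / db_name / "dup_detection"
    pattern = f"dup_decisions_{db_name}_{initials}_{date}.csv"

    # glob yields nothing if the folder or file is missing, so the result
    # is an empty list in that case.
    return sorted(str(p) for p in folder.glob(pattern))
```

An empty list corresponds to the warning branch above, and a non-empty list to the success branch. Because the date stamp is part of the file name, a decision file written on an earlier day will not be found by this lookup.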