Khalid Belhajjame
Ashish Gehani
Pinar Alper (Eds.)

Provenance and Annotation of Data and Processes
7th International Provenance and Annotation Workshop, IPAW 2018
London, UK, July 9–10, 2018, Proceedings

LNCS 11017

Lecture Notes in Computer Science 11017
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zurich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology Madras, Chennai, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Editors

Khalid Belhajjame
Paris Dauphine University
Paris, France

Pinar Alper
University of Luxembourg
Belvaux, Luxembourg

Ashish Gehani
SRI International
Menlo Park, CA, USA

ISSN 0302-9743 ISSN 1611-3349 (electronic)


Lecture Notes in Computer Science
ISBN 978-3-319-98378-3 ISBN 978-3-319-98379-0 (eBook)
https://doi.org/10.1007/978-3-319-98379-0

Library of Congress Control Number: 2018951244

LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI

© Springer Nature Switzerland AG 2018


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This volume contains the proceedings of the 7th International Provenance and
Annotation Workshop (IPAW), held during July 9–10, 2018, at King’s College in
London, UK. For the third time, IPAW was co-located with the Workshop on the
Theory and Practice of Provenance (TaPP). Together, the two leading provenance
workshops anchored Provenance Week 2018, a full week of provenance-related
activities that included a shared poster session and three other workshops on algorithm
accountability, incremental re-computation, and security. The proceedings of IPAW
include 12 long papers that report in-depth the results of research around provenance,
two system demonstration papers, and 19 poster papers.
IPAW 2018 provided a rich program with a variety of provenance-related topics
ranging from the capture and inference of provenance to its use and application. Since
provenance is a key ingredient to enable reproducibility, several papers have investigated
means for enabling dataflow steering and process re-computation. The modeling
of provenance and its simulation has been the subject of a number of papers, which
tackled issues that seek, among other things, to model provenance in software engineering
activities or to use provenance to model aspects of the European Union General
Data Protection Regulation. Other papers investigated inference techniques to propagate
beliefs in provenance graphs, efficiently update RDF graphs, mine similarities
between processes, and discover workflow schema-level dependencies. This year’s
program also featured extensions of the W3C PROV recommendation to support new
features, e.g., versioning of mutable entities, or cater for new domain knowledge, e.g.,
astronomy.
In closing, we would like to thank the members of the Program Committee for their
thoughtful reviews, Vasa Curcin and Simon Miles for the local organization of IPAW
and the Provenance Week at King’s College, London, and the authors and participants
for making IPAW a successful event.

June 2018

Khalid Belhajjame
Ashish Gehani
Pinar Alper
Organization

Program Committee
Pinar Alper               University of Luxembourg, Luxembourg
Ilkay Altintas            SDSC, USA
David Archer              Galois, Inc., USA
Khalid Belhajjame         University of Paris-Dauphine, France
Vanessa Braganholo        UFF, Brazil
Kevin Butler              University of Florida, USA
Sarah Cohen-Boulakia      LRI, University of Paris-Sud, France
Oscar Corcho              Universidad Politécnica de Madrid, Spain
Vasa Curcin               King’s College London, UK
Susan Davidson            University of Pennsylvania, USA
Daniel de Oliveira        Fluminense Federal University, Brazil
Saumen Dey                University of California, Davis, USA
Alban Gaignard            CNRS, France
Daniel Garijo             Information Sciences Institute, USA
Ashish Gehani             SRI International, USA
Paul Groth                Elsevier Labs, The Netherlands
Trung Dong Huynh          King’s College London, UK
Grigoris Karvounarakis    LogicBlox, Greece
David Koop                University of Massachusetts Dartmouth, USA
Bertram Ludaescher        University of Illinois at Urbana-Champaign, USA
Tanu Malik                University of Chicago, USA
Marta Mattoso             Federal University of Rio de Janeiro, Brazil
Deborah McGuinness        Rensselaer Polytechnic Institute (RPI), USA
Simon Miles               King’s College London, UK
Paolo Missier             Newcastle University, UK
Luc Moreau                King’s College London, UK
Beth Plale                Indiana University Bloomington, USA
Satya Sahoo               Case Western Reserve University, USA
Stian Soiland-Reyes       The University of Manchester, UK
Jun Zhao                  University of Oxford, UK

Additional Reviewers

Carvalho, Lucas Augusto Montalvão Costa
Cała, Jacek
Chagas, Clayton
Pimentel, João
Rashid, Sabbir
Souza, Renan
Yan, Rui
Contents

Reproducibility

Provenance Annotation and Analysis to Support Process Re-computation
  Jacek Cała and Paolo Missier

Provenance of Dynamic Adaptations in User-Steered Dataflows
  Renan Souza and Marta Mattoso

Classification of Provenance Triples for Scientific Reproducibility: A Comparative Evaluation of Deep Learning Models in the ProvCaRe Project
  Joshua Valdez, Matthew Kim, Michael Rueschman, Susan Redline, and Satya S. Sahoo

Modeling, Simulating and Capturing Provenance

A Provenance Model for the European Union General Data Protection Regulation
  Benjamin E. Ujcich, Adam Bates, and William H. Sanders

Automating Provenance Capture in Software Engineering with UML2PROV
  Carlos Sáenz-Adán, Luc Moreau, Beatriz Pérez, Simon Miles, and Francisco J. García-Izquierdo

Simulated Domain-Specific Provenance
  Pinar Alper, Elliot Fairweather, and Vasa Curcin

PROV Extensions

Versioned-PROV: A PROV Extension to Support Mutable Data Entities
  João Felipe N. Pimentel, Paolo Missier, Leonardo Murta, and Vanessa Braganholo

Using the Provenance from Astronomical Workflows to Increase Processing Efficiency
  Michael A. C. Johnson, Luc Moreau, Adriane Chapman, Poshak Gandhi, and Carlos Sáenz-Adán

Scientific Workflows

Discovering Similar Workflows via Provenance Clustering: A Case Study
  Abdussalam Alawini, Leshang Chen, Susan Davidson, Stephen Fisher, and Junhyong Kim

Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
  Shawn Bowers, Timothy McPhillips, and Bertram Ludäscher

Applications

Belief Propagation Through Provenance Graphs
  Belfrit Victor Batlajery, Mark Weal, Adriane Chapman, and Luc Moreau

Using Provenance to Efficiently Propagate SPARQL Updates on RDF Source Graphs
  Iman Naja and Nicholas Gibbins

System Demonstrations

Implementing Data Provenance in Health Data Analytics Software
  Shen Xu, Elliot Fairweather, Toby Rogers, and Vasa Curcin

Quine: A Temporal Graph System for Provenance Storage and Analysis
  Ryan Wright

Joint IPAW/TaPP Poster Session

Capturing Provenance for Runtime Data Analysis in Computational Science and Engineering Applications
  Vítor Silva, Renan Souza, Jose Camata, Daniel de Oliveira, Patrick Valduriez, Alvaro L. G. A. Coutinho, and Marta Mattoso

UniProv - Provenance Management for UNICORE Workflows in HPC Environments
  André Giesler, Myriam Czekala, and Björn Hagemeier

Towards a PROV Ontology for Simulation Models
  Andreas Ruscheinski, Dragana Gjorgevikj, Marcus Dombrowsky, Kai Budde, and Adelinde M. Uhrmacher

Capturing the Provenance of Internet of Things Deployments
  David Corsar, Milan Markovic, and Peter Edwards

Towards Transparency of IoT Message Brokers
  Milan Markovic, David Corsar, Waqar Asif, Peter Edwards, and Muttukrishnan Rajarajan

Provenance-Based Root Cause Analysis for Revenue Leakage Detection: A Telecommunication Case Study
  Wisam Abbasi and Adel Taweel

Case Base Reasoning Decision Support Using the DecPROV Ontology for Decision Modelling
  Nicholas J. Car

Bottleneck Patterns in Provenance
  Sara Boutamina, James D. A. Millington, and Simon Miles

Architecture for Template-Driven Provenance Recording
  Elliot Fairweather, Pinar Alper, Talya Porat, and Vasa Curcin

Combining Provenance Management and Schema Evolution
  Tanja Auge and Andreas Heuer

Provenance for Entity Resolution
  Sarah Oppold and Melanie Herschel

Where Provenance in Database Storage
  Alexander Rasin, Tanu Malik, James Wagner, and Caleb Kim

Streaming Provenance Compression
  Raza Ahmad, Melanie Bru, and Ashish Gehani

Structural Analysis of Whole-System Provenance Graphs
  Jyothish Soman, Thomas Bytheway, Lucian Carata, Nikilesh D. Balakrishnan, Ripduman Sohan, and Robert N. M. Watson

A Graph Testing Framework for Provenance Network Analytics
  Bernard Roper, Adriane Chapman, David Martin, and Jeremy Morley

Provenance for Astrophysical Data
  Anastasia Galkin, Kristin Riebe, Ole Streicher, Francois Bonnarel, Mireille Louys, Michèle Sanguillon, Mathieu Servillat, and Markus Nullmeier

Data Provenance in Agriculture
  Sérgio Manuel Serra da Cruz, Marcos Bacis Ceddia, Renan Carvalho Tàvora Miranda, Gabriel Rizzo, Filipe Klinger, Renato Cerceau, Ricardo Mesquita, Ricardo Cerceau, Elton Carneiro Marinho, Eber Assis Schmitz, Elaine Sigette, and Pedro Vieira Cruz

Extracting Provenance Metadata from Privacy Policies
  Harshvardhan Jitendra Pandit, Declan O’Sullivan, and Dave Lewis

Provenance-Enabled Stewardship of Human Data in the GDPR Era
  Pinar Alper, Regina Becker, Venkata Satagopam, Christophe Trefois, Valentin Grouès, Jacek Lebioda, and Yohan Jarosz

Author Index


Reproducibility
Provenance Annotation and Analysis
to Support Process Re-computation

Jacek Cała and Paolo Missier

School of Computing, Newcastle University, Newcastle upon Tyne, UK


{Jacek.Cala,Paolo.Missier}@ncl.ac.uk

Abstract. Many resource-intensive analytics processes evolve over time
following new versions of the reference datasets and software dependencies
they use. We focus on scenarios in which any version change has
the potential to affect many outcomes, as is the case for instance in high
throughput genomics where the same process is used to analyse large
cohorts of patient genomes, or cases. As any version change is unlikely
to affect the entire population, an efficient strategy for restoring the currency
of the outcomes requires first identifying the scope of a change, i.e.,
the subset of affected data products. In this paper we describe a generic
and reusable provenance-based approach to address this scope discovery
problem. It applies to a scenario where the process consists of complex
hierarchical components, where different input cases are processed using
different version configurations of each component, and where separate
provenance traces are collected for the executions of each of the components.
We show how a new data structure, called a restart tree, is
computed and exploited to manage the change scope discovery problem.

Keywords: Provenance annotations · Process re-computation

1 Introduction
Consider data analytics processes that exhibit the following characteristics. C1:
they are resource-intensive and thus expensive when repeatedly executed over time,
e.g., on a cloud or HPC cluster; C2: they require sophisticated implementations to run
efficiently, such as workflows with a nested structure; C3: they depend on multiple
reference datasets and software libraries and tools, some of which are versioned
and evolve over time; C4: they apply to a possibly large population of input instances.
This is not an uncommon set of characteristics. A prime example is data
processing for high throughput genomics, where the genomes (or exomes) of
a cohort of patient cases are processed, individually or in batches, to produce
lists of variants (genetic mutations) that form the basis for a number of diagnostic
purposes. These variant calling and interpretation pipelines take batches
of 20–40 patient exomes and require hundreds of CPU-hours to complete (C1).
Initiatives like the 100K Genome project in the UK (www.genomicsengland.co.uk)
provide a perspective on the scale of the problem (C4).
© Springer Nature Switzerland AG 2018
K. Belhajjame et al. (Eds.): IPAW 2018, LNCS 11017, pp. 3–15, 2018.
https://doi.org/10.1007/978-3-319-98379-0_1

Fig. 1. A typical variant discovery pipeline processing a pool of input samples. Each
step is usually implemented as a workflow or script that combines a number of tools
run in parallel.

Figure 1, taken from our prior work [5], shows the nested workflow structure
(C2) of a typical variant calling pipeline based on the GATK (Genome Analysis
Toolkit) best practices from the Broad Institute.¹ Each task in the pipeline relies
on some GATK (or other open source) tool, which in turn requires lookups in
public reference datasets. For most of these processes and reference datasets new
versions are issued periodically or on an as-needed basis (C3). The entire pipeline
may be variously implemented as an HPC cluster script or workflow. Each single
run of the pipeline creates a hierarchy of executions which are distributed across
worker nodes and coordinated by the orchestrating top-level workflow or script
(cf. the “Germline Variant Discovery” workflow depicted in the figure).
Upgrading one or more of the versioned elements risks invalidating previously
computed knowledge outcomes, e.g. the sets of variants associated with
patient cases. Thus, a natural reaction to a version change in a dependency is to
upgrade the pipeline and then re-process all the cases. However, as we show in
the example at the end of this section, not all version changes affect each case
equally, or in a way that completely invalidates prior outcomes. Also, within each
pipeline execution only some of the steps may be affected. We therefore need a
system that can perform more selective re-processing in reaction to a change. In
[6] we have described our initial results in developing such a system for selective
re-computation over a population of cases in reaction to changes, called ReComp.
ReComp is a meta-process designed to detect the scope of a single change or of
a combination of changes, estimate the impact of those changes on the population
in scope, prioritise the cases for re-processing, and determine the minimal
amount of re-processing required for each of those cases. Note that, while ideally
the process of upgrading P is controlled by ReComp, in reality we must also
account for upgrades of P that are performed “out-of-band” by developers, as
we have assumed in our problem formulation.

¹ https://software.broadinstitute.org/gatk/best-practices.

Fig. 2. Schematic of the ReComp meta-process.

Briefly, ReComp consists of the macro-steps shown in Fig. 2. The work presented
in this paper is instrumental to the ReComp design, as it addresses the
very first step (S1) indicated in the figure, in a way that is generic and agnostic
to the type of process and data.

1.1 Version Changes and Their Scope


To frame the problem addressed in the rest of the paper, we introduce a simple
model for version changes as triggers for re-computation. Consider an abstract
process $P$ and a population $X = \{x_1 \dots x_N\}$ of inputs to $P$, referred to as cases.
Let $\mathcal{D} = [D_1 \dots D_m]$ be an ordered list of versioned dependencies. These are
components, typically software libraries or reference data sets, which are used
by $P$ to process a case. Each $D$ has a version, denoted $D.v$, with a total order
on the sequence of versions $D.v < D.v' < D.v'' < \dots$ for each $D$.
An execution configuration for $P$ is the vector $V = [v_1 \dots v_m]$ of version
numbers for $[D_1 \dots D_m]$. Typically, these are the latest versions for each $D$, but
configurations where some $D$ is “rolled back” to an older version are possible.
The set of total orders on the versions of each $D \in \mathcal{D}$ induces a partial order on
the set of configurations:
$$[v_1 \dots v_m] \prec [v'_1 \dots v'_m] \text{ iff } \{v_i \le v'_i\}_{i:1 \dots m} \text{ and } v_i < v'_i \text{ for at least one } v_i.$$
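The partial order on configurations can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `precedes` and the encoding of versions as integers are assumptions made for the example, not part of the authors' implementation.

```python
from typing import Sequence


def precedes(v: Sequence[int], w: Sequence[int]) -> bool:
    """Return True iff configuration v strictly precedes configuration w.

    v precedes w when every component of v is <= the matching component
    of w and at least one component is strictly smaller.
    """
    if len(v) != len(w):
        raise ValueError("configurations must cover the same dependencies")
    pairs = list(zip(v, w))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


# [1, 1] precedes [2, 1]; [2, 1] and [1, 2] are incomparable.
print(precedes([1, 1], [2, 1]))  # True
print(precedes([2, 1], [1, 2]))  # False
print(precedes([1, 2], [2, 1]))  # False
```

The last two calls show why the order is only partial: a configuration in which one dependency was rolled back is neither before nor after one in which a different dependency was rolled back.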
We denote an execution of $P$ on input $x \in X$ using configuration $V$ by
$E = P(x, V)$, where $P$ may consist of multiple components $\{P_1 \dots P_k\}$, such
as those in our example pipeline. When this is the case, we assume for generality
that one execution $P(x, V)$ given $x$ and $V$ is realised as a collection
$\{E_i = P_i(x, V)\}_{i:1 \dots k}$ of separate executions, one for each $P_i$. We use the W3C
PROV [13] and ProvONE [7] abstract vocabularies to capture this model, in
which $P, P_1 \dots P_k$ are all instances of provone:Program, their relationship is
expressed as
$$\{\text{provone:hasSubProgram}(P, P_i)\}_{i:1 \dots k}$$
and each execution $E_i$ is associated with its program $P_i$ using:
$$\{\text{wasAssociatedWith}(E_i, \_, P_i)\}_{i:1 \dots k}$$

Version Change Events. We use PROV derivation statements prov:wasDerivedFrom
to denote a version change event $C$ for some $D_i$, from $v_i$ to $v'_i$:
$C = \{D.v'_i \xrightarrow{wDF} D.v_i\}$. Given $V = [v_1 \dots v_i \dots v_m]$, $C$ enables the new
configuration $V' = [v_1 \dots v'_i \dots v_m]$, meaning that $V'$ can be applied to $P$, so that its
future executions are of the form $E = P(x, V')$.
We model sequences of changes by assuming that an unbounded stream of
change events $C_1, C_2, \dots$ can be observed over time, either for different or the
same $D_i$. A re-processing system may react to each change individually. However,
we assume the more general model where a set of changes accumulates into a
window (according to some criteria, for instance fixed-time) and is processed as a
batch. Thus, by extension, we define a composite change to be a set of elementary
changes that are part of the same window. Given $V = [v_1 \dots v_i \dots v_j \dots v_m]$, we
say that $C = \{D.v'_i \xrightarrow{wDF} D.v_i,\ D.v'_j \xrightarrow{wDF} D.v_j, \dots\}$ enables configuration
$V' = [v_1 \dots v'_i \dots v'_j \dots v_m]$. Importantly, all change events, whether individual
or accumulated into windows, are merged together into the single change front
$CF$, which is the configuration of the latest versions of all changed artefacts.
Applying $CF$ to $E = P(x, V)$ involves re-processing $x$ using $P$ to bring the
outcomes up-to-date with respect to all versions in the change front. For instance,
given $V = [v_1, v_2, v_3]$ and the change front $CF = \{v'_1, v'_2\}$, the re-execution of
$E = P(x, [v_1, v_2, v_3])$ is $E' = P(x, [v'_1, v'_2, v_3])$. It is important to keep track of
how elements of the change front are updated as it may be possible to avoid
rerunning some of $P$’s components for which the configuration has not changed.
Without this fine-grained derivation information, each new execution may use
the latest versions but cannot be easily optimised using partial re-processing.
Clearly, processing change events as a batch is more efficient than processing
each change separately, cf. $E' = P(x, [v'_1, v_2, v_3])$ followed by
$E'' = P(x, [v'_1, v'_2, v_3])$ with the example above. But a model that manages change
events as a batch is also general in that it accommodates a variety of refresh
strategies. For example, applying changes that are known to have limited impact
on the outcomes can be delayed until a sufficient number of other changes have
accumulated into $CF$, or until a specific high-impact change event has occurred.
A discussion of specific strategies that are enabled by our scope discovery algorithm
is beyond the scope of this paper.
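The batching of change events into a change front, and its application to a configuration, can be illustrated with a minimal Python sketch. The names `ChangeEvent`, `change_front` and `apply_front`, the dependency labels, and the integer version numbers are all hypothetical choices for this example; the paper does not prescribe a concrete encoding.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical encoding: a change event maps a dependency name to
# (old_version, new_version), mirroring "new wasDerivedFrom old".
ChangeEvent = Dict[str, Tuple[int, int]]


def change_front(events: Iterable[ChangeEvent]) -> Dict[str, int]:
    """Merge change events into the latest version of each changed dependency."""
    front: Dict[str, int] = {}
    for event in events:
        for dep, (_old, new) in event.items():
            front[dep] = max(new, front.get(dep, new))
    return front


def apply_front(deps: List[str], config: List[int], front: Dict[str, int]) -> List[int]:
    """Return the configuration obtained by applying the change front to config.

    Dependencies that do not appear in the front keep their current version.
    """
    return [front.get(dep, version) for dep, version in zip(deps, config)]


deps = ["D1", "D2", "D3"]
front = change_front([{"D1": (1, 2)}, {"D2": (1, 2)}])
print(front)                                  # {'D1': 2, 'D2': 2}
print(apply_front(deps, [1, 1, 1], front))    # [2, 2, 1]
```

Processing the two events as one batch yields a single re-execution with the merged configuration, rather than one re-execution per event.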

1.2 Problem Formulation and Contributions


Suppose $P$ has been executed $h$ times for some $x \in X$, each time with a different
configuration $V_1 \dots V_h$. The collection of past executions, for each $x \in X$, is:

$$\{E(P_i, x, V_j)\}_{i:1 \dots k,\ j:1 \dots h,\ x \in X} \tag{1}$$

The problem we address in this paper is to identify, for each change front
CF , the smallest set of those executions that are affected by CF . We call this the
re-computation front C relative to CF . We address this problem in a complex
general setting where many types of time-interleaved changes are allowed, where
many configurations are enabled by any of these changes, and where executions
may reflect any of these configurations, and in particular individual cases x may
be processed using any such different configurations. The example from the next
section illustrates how this setting can manifest itself in practice.
Our main contribution is a generic algorithm for discovering the re-computation
front that applies to a range of processes, from simple black-box, single-component
programs where P is indivisible, to complex hierarchical workflows where P
consists of subprograms Pi which may themselves be defined in terms of subprograms.
Following a tradition from the literature to use provenance as a means
to address re-computation [2,6,12], our approach also involves collecting and
exploiting both execution provenance for each E and elements of process–subprocess
dependencies as mentioned above. To the best of our knowledge this
particular use of provenance and the algorithm have not been proposed before.

1.3 Example: Versioning in Genomics


The problem of version change emerges concretely in Genomics pipelines in which
changes have different scope, both within each process instance and across the
population of cases. For example, an upgrade to the bwa aligner tool directly
affects merely the alignment task but its impact may propagate to most of
the tasks downstream. Conversely, an upgrade in the human reference genome
directly affects the majority of the tasks. In both cases, however, the entire
population of executions is affected because current alignment algorithms are
viewed as “black boxes” that use the entire reference genome.
However, a change in one of the other reference databases that are queried for
specific information only affects those cases where some of the changed records
are part of a query result. One example is ClinVar, a popular variant database
queried to retrieve information about specific diseases (phenotypes). In this case,
changes that affect one phenotype will not impact cases that exhibit a completely
different phenotype. But to detect the impact ReComp uses steps (S2) and (S3),
which are out of scope of this paper.
Additionally, note that version changes in this Genomics example occur with
diverse frequency. For instance, the reference genome is updated twice a year,
alignment libraries every few months, and ClinVar every month.

2 Recomputation Fronts and Restart Trees


2.1 Recomputation Fronts
In Sect. 1.1 we have introduced a partial order $V \prec V'$ between process configurations.
In particular, given $V$, if a change $C$ enables $V'$ then by definition
$V \prec V'$. Note that this order induces a corresponding partial order between any
two executions that operate on the same $x \in X$:
$$P(x, V) = E \preceq E' = P(x, V') \text{ iff } V \prec V' \tag{2}$$
This order is important, because optimising re-execution, i.e. executing $P(x, V')$,
may benefit most from the provenance associated with the latest execution
according to the sequence of version changes, which is $E = P(x, V)$ (a discussion
on the precise types of such optimisations can be found in [6]). For this
reason in our implementation we keep track of the execution order explicitly
using the wasInformedBy PROV relationship, i.e. we record the PROV statement
$E' \xrightarrow{wIB} E$ whenever re-executing $E$ such that $E \preceq E'$.
To see how these chains of ordered executions may evolve consider, for
instance, $E_0 = P(x_1, [a_1, b_1])$, $E_1 = P(x_2, [a_1, b_1])$ for inputs $x_1$, $x_2$ respectively,
where the $a$ and $b$ are versions for two dependencies $D_1$, $D_2$. The situation is
depicted in Fig. 3/left. When change $C_1 = \{a_2 \xrightarrow{wDF} a_1\}$ occurs, it is possible
that only $x_1$ is re-processed, but not $x_2$. This may happen, for example, when
$D_1$ is a data dependency and the change affects parts of the data which were not
used by $E_1$ in the processing of input $x_2$. In this case, $C_1$ would trigger one single
new execution: $E_2 = P(x_1, [a_2, b_1])$ where we record the ordering $E_0 \preceq E_2$. The
new state is depicted in Fig. 3/middle.

Fig. 3. The process of annotating re-execution following a sequence of events; in bold
are executions on the re-computation front; a- and b-axis represent the artefact derivation;
arrows in blue denote the wasInformedBy relation. (Color figure online)

Now consider the new change
$C_2 = \{a_3 \xrightarrow{wDF} a_2, b_2 \xrightarrow{wDF} b_1\}$, affecting both
$D_1$ and $D_2$, and suppose both $x_1$ and $x_2$ are going to be re-processed.
Then, for each $x$ we retrieve the latest executions that are affected by the
change, in this case $E_2$, $E_1$, as their provenance may help optimise the
re-processing of $x_1$, $x_2$ using the new change front $\{a_3, b_2\}$. After
re-processing we have two new executions, $E_3 = P(x_1, [a_3, b_2])$ and
$E_4 = P(x_2, [a_3, b_2])$, which may have been optimised using $E_2$, $E_1$,
respectively, as indicated by their ordering: $E_3 \succ E_2$,
$E_4 \succ E_1$ (see Fig. 3/right).
To continue with the example, let us now assume that the provenance for
a new execution $E_5 = P(x_1, [a_1, b_2])$ appears in the system. This may have
been triggered by an explicit user action independently from our re-processing
system. Note that the user has disregarded the fact that the latest version of
$a$ is $a_3$. The corresponding scenario is depicted in Fig. 4/left. We now have
two executions for $x_1$ with two configurations. Note that although
$E_0 \prec E_5$ holds, it is not reflected by a corresponding
$E_5 \xrightarrow{wIB} E_0$ in our re-computation system because $E_5$ was an
explicit user action. However, consider another change event:
$\{b_3 \xrightarrow{wDF} b_2\}$. For $x_2$, the affected execution is $E_4$, as
this is the single latest execution in the ordering recorded so far for $x_2$.
But for $x_1$ there are now two executions that need to be brought up-to-date,
$E_3$ and $E_5$, as these are the maximal elements in the set of executions for
$x_1$ according to the order $E_0 \prec E_2 \prec E_3$, $E_0 \prec E_5$. We call
these executions the recomputation front for $x_1$ relative to change front
$\{a_3, b_3\}$, in this case.

Fig. 4. Continuation of Fig. 3; in bold are executions on the re-computation front; a-
and b-axis represent the artefact derivation; arrows in blue denote the wasInformedBy
relation. (Color figure online)

This situation, depicted in Fig. 4/right, illustrates the most general case
where the entire set of previous executions needs to be considered when
re-processing an input with a new configuration. Note that the two independent
executions $E_3$ and $E_5$ have merged into the new $E_6$.
Formally, the recomputation front for $x \in X$ and for a change front
$CF = \{w_1 \ldots w_k\}$, $k \le m$, is the set of maximal executions
$E = P(x, [v_1 \ldots v_m])$ where $v_i \le w_i$ for $1 \le i \le m$.
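
The notion of "maximal executions" can be illustrated with a minimal Python
sketch. It assumes the ordering between executions is available as explicit
(earlier, later) pairs; the function name maximal_executions and the string
identifiers are hypothetical and not part of the paper's implementation. The
sketch also ignores the additional constraint that versions must not exceed
the change front.

```python
def maximal_executions(executions, precedes):
    """Return the executions that no recorded later execution supersedes.

    executions: iterable of execution identifiers for one input x.
    precedes:   iterable of (earlier, later) pairs, i.e. earlier ≺ later.
    """
    # Any execution that appears as the "earlier" side of a pair has a
    # successor, so it cannot be a maximal element of the order.
    superseded = {earlier for earlier, _later in precedes}
    return sorted(e for e in executions if e not in superseded)


# Ordering for x1 taken from the running example: E0 ≺ E2 ≺ E3 and E0 ≺ E5.
executions_x1 = ["E0", "E2", "E3", "E5"]
order_x1 = [("E0", "E2"), ("E2", "E3"), ("E0", "E5")]

front_x1 = maximal_executions(executions_x1, order_x1)
print(front_x1)  # ['E3', 'E5']
```

For the running example this yields E3 and E5, the two executions named in
the text as the recomputation front for x1.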

2.2 Building a Restart Tree

Following our goal to develop a generic re-computation meta-process, the
front-finding algorithm needs to support processes of varying complexity, from
the simplest black-box processes to the complex hierarchical workflows mentioned
earlier. This requirement adds another dimension to the problem of the
identification of the re-computation front.
If process P has a hierarchical structure, e.g. expressed using the
provone:hasSubProgram statement (cf. Sect. 1.1), one run of P will usually
result in a collection of executions. These are logically organised into a
hierarchy, where the top level represents the execution of the program itself,
and sub-executions (connected via provone:wasPartOf) represent the executions of
the sub-programs. Following the principle of the separation of concerns, we
assume the general case where the top-level program is not aware of the data and
software dependencies of its parts. Thus, discovering which parts of the program
used a particular dependency requires traversing the entire hierarchy of
executions.

To illustrate this problem let us focus on a small part of our pipeline: the
alignment step (Align Sample and Align Lane). Figure 5 shows this step modelled
using ProvONE. $P_0$ denotes the top program, the Align Sample workflow; $SP_0$
is the Align Lane subprogram; $SSP_0$–$SSP_3$ represent the sub-subprograms of
bioinformatic tools like bwa and samtools, while $SP_1$–$SP_3$ are the
invocations of the samtools program. Programs have input and output ports (the
dotted grey arrows), and ports $p_1$–$p_8$ are related to default artefacts
$a_0$, $b_0$, etc. specified using the provone:hasDefaultParam statement. The
artefacts refer to the code of the executable file and data dependencies; e.g.
$e_0$ represents the code of samtools. Programs are connected to each other via
ports and channels, which in the figure are identified using reversed double
arrows.

Fig. 5. A small part of the Genomics pipeline shown in Fig. 1 encoded in ProvONE.
The figure depicts the hasSubProgram relation; the hasDefaultParam statements;
hasInPort/hasOutPort; and the sequence of the
{$P_i$ hasOutPort $p_m$ connectsTo $Ch_x$, $P_j$ hasInPort $p_n$ connectsTo $Ch_x$}
statements.

Running this part of the pipeline would generate runtime provenance
information with a structure resembling the program specification (cf. Fig. 6).
The main difference between the static program model and runtime information
is that during execution all ports transfer some data: either default artefacts
indicated in the program specification, data provided by the user (e.g. the
input sample), or the output data product. When introducing a change in this
context, e.g. $\{b_1 \xrightarrow{wDF} b_0, e_1 \xrightarrow{wDF} e_0\}$, two
things are important. Firstly, the usage of the artefacts is captured at the
sub-execution level ($SSE_1$, $SSE_3$ and $SE_1$–$SE_3$) while $E_0$ uses these
artefacts indirectly. Secondly, to rerun the alignment step it is useful to
consider the sub-executions grouped together under $E_0$, which determines the
end of processing and delivers data $y_0$ and $z_0$ meaningful for the user. We
can capture both these elements using the tree structure that naturally fits
the hierarchy of executions encoded with ProvONE. We call this tree the
restart tree as it indicates the initial set of executions that need to be
rerun. The tree also provides references to the changed artefacts, which is
useful to perform further steps of the ReComp meta-process. Figure 6 shows in
blue the restart tree generated as a result of change in artefacts $b$ and $e$.

Fig. 6. An execution trace for the program shown in Fig. 5 with the restart
tree and artefact references highlighted in blue. The figure depicts the
wasPartOf relation between executions; the used statements; and the sequence of
the $E_j$ used $z$ wasGeneratedBy $E_i$ statements. (Color figure online)

Finding the restart tree involves building paths from the executions that used
changed artefacts, all the way up to the top-level execution following the
wasPartOf relation. The tree is formed by merging all paths with the same
top-level execution.

3 Computing the Re-computation Front


Combining the three parts discussed above, we present in Listing 1.1 the
pseudocode of our algorithm to identify the re-computation front. The input of
the algorithm is the change front CF that the ReComp framework keeps updating
with every change observed. The output is a list of restart trees, each rooted
at a top-level execution. Every node of the tree is a triple
(E, [changedData], [children]) that combines an execution with optional lists
of the changed data artefacts it used and the sub-executions it coordinated.
For executions that represent a simple black-box process the output of the
algorithm reduces to a list of triples like
$[(E_i, [a_k, a_l, \ldots], [\,]), (E_j, [a_m, a_n, \ldots], [\,]), \ldots]$
in which the third element of each node is always empty. For the example of a
hierarchical process shown above in Fig. 6 the output would be
$[(E_0, [\,], [(SE_0, [\,], [(SSE_1, [b_0], [\,]), (SSE_3, [e_0], [\,])]),
(SE_1, [e_0], [\,]), (SE_2, [e_0], [\,]), (SE_3, [e_0], [\,])])]$.
The algorithm starts by creating the root node, OutTree, of an imaginary
tree that will combine all independent executions affected by the change front.
Then, it iterates over all artefacts in the ChangeFront set and for each artefact
it traverses the chain of versions:
$Item \xrightarrow{wDF} PredI \xrightarrow{wDF} \ldots$ (line 4). For each
version it looks up all the executions that used that particular version of the
data (line 5). The core of the algorithm (lines 6–7) is used to build trees out
of the affected executions. In line 6 a path from the affected execution to its
top-level parent execution is built. Then, the path is merged with the OutTree
such that two paths with the same top-level execution are joined into the same
subtree, whereas paths with different roots become two different subtrees on
the OutTree.children list.

Listing 1.1. An algorithm to find the re-computation front.


1 function find_recomp_front(ChangeFront) : TreeList
2   OutTree := (root, data := [], children := [])
3   for Item in ChangeFront do
4     for PredI in traverse_derivations(Item) do
5       for Exec in iter_used(PredI) do
6         Path := path_to_root(PredI, Exec)
7         OutTree.merge_path(Path)
8   return OutTree.children
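
The merging step (lines 6–7) can be sketched in Python as follows. This is an
illustration under stated assumptions rather than the authors' Prolog code:
the Node class, its field names and the as_triple helper are hypothetical, and
the paths are hard-coded in the form [changed item, execution, parent, ...,
top-level execution] for the example of Fig. 6 instead of being derived from
provenance facts.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    execution: str
    data: list = field(default_factory=list)      # changed artefacts used
    children: list = field(default_factory=list)  # affected sub-executions

    def merge_path(self, path):
        """Merge [changed_item, exec, parent, ..., top_level] into the tree."""
        if not path:
            return  # empty path: the execution is not on the front
        changed_item, *executions = path
        node = self
        # Walk from the top-level execution down, reusing existing nodes so
        # that paths sharing a top-level execution end up in one subtree.
        for execution in reversed(executions):
            child = next(
                (c for c in node.children if c.execution == execution), None
            )
            if child is None:
                child = Node(execution)
                node.children.append(child)
            node = child
        if changed_item not in node.data:
            node.data.append(changed_item)

    def as_triple(self):
        return (self.execution, self.data,
                [c.as_triple() for c in self.children])


# Paths for the Fig. 6 example (assumed, written out by hand).
paths = [
    ["b0", "SSE1", "SE0", "E0"],
    ["e0", "SSE3", "SE0", "E0"],
    ["e0", "SE1", "E0"],
    ["e0", "SE2", "E0"],
    ["e0", "SE3", "E0"],
]

out_tree = Node("root")
for path in paths:
    out_tree.merge_path(path)

restart_trees = [child.as_triple() for child in out_tree.children]
print(restart_trees)
```

With these paths the sketch produces a single restart tree rooted at E0 whose
shape matches the output given in the text for Fig. 6.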
Listing 1.2 shows the path_to_root function that creates the path from the
given execution to its top-level parent execution. First it checks if the given
execution Exec has already been re-executed (lines 4–6). It does so by
iterating over all wasInformedBy statements in which Exec is the informant and
checking if the statement is typed as recomp:re-execution. If such a statement
exists, path_to_root returns the empty path to indicate that Exec is not on the
front (line 6). Otherwise, if none of the communication statements indicates
re-execution by ReComp, Exec is added to the path (line 7) and the algorithm
moves one level up to check the parent execution (line 8). This is repeated
until Exec is the top-level parent, in which case get_parent(Exec) returns null
and the loop ends. Note that get_parent(X) returns the execution Y for which
the statement X wasPartOf Y holds.

Listing 1.2. Function to generate the path from the given execution to its top-level
parent.
1 function path_to_root(ChangedItem, Exec) : Path
2   OutPath := [ChangedItem]
3   repeat
4     for wIB in iter_wasinformedby(Exec)
5       if typeof(wIB) is "recomp:re-execution" then
6         return []
7     OutPath.append(Exec)
8     Exec := get_parent(Exec)
9   until Exec = null
10  return OutPath
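
For readers more familiar with Python, Listing 1.2 might be transcribed as
below. The sketch assumes the provenance facts are held in two plain
dictionaries, which is an illustrative simplification: was_part_of maps an
execution to its parent, and was_informed_by maps an informant execution to a
list of (informed execution, statement type) pairs. Both parameter names are
hypothetical.

```python
RE_EXECUTION = "recomp:re-execution"


def path_to_root(changed_item, execution, was_part_of, was_informed_by):
    """Return [changed_item, execution, parent, ..., top_level], or [] if
    any execution on the way up has already been re-executed by ReComp."""
    path = [changed_item]
    while execution is not None:
        # An execution that informed a re-execution is no longer on the front.
        for _informed, statement_type in was_informed_by.get(execution, []):
            if statement_type == RE_EXECUTION:
                return []
        path.append(execution)
        # Move one level up; top-level executions have no parent.
        execution = was_part_of.get(execution)
    return path


# Part of the hierarchy of Fig. 6: SSE1 wasPartOf SE0, SE0 wasPartOf E0.
was_part_of = {"SSE1": "SE0", "SE0": "E0"}

fresh = path_to_root("b0", "SSE1", was_part_of, {})
print(fresh)  # ['b0', 'SSE1', 'SE0', 'E0']

# Hypothetical case: E0 has already been re-executed as E7.
already_rerun = {"E0": [("E7", RE_EXECUTION)]}
stale = path_to_root("b0", "SSE1", was_part_of, already_rerun)
print(stale)  # []
```

Because the check is repeated at every level, a re-execution recorded for any
ancestor removes the whole path, mirroring the behaviour of the pseudocode.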

The discussion of other functions used in the proposed algorithm, such as
traverse_derivations and iter_used, is omitted from the paper as they are simple
to implement. Interested readers can download the complete algorithm written in
Prolog from our GitHub repository.² Preliminary performance tests showed
execution times on the order of milliseconds when run on a 250 MB database of
provenance facts for about 56k composite executions and a set of artefact
documents of which two had 15 and 19 version changes. As expected, the response
time increased with the length of the derivation chain.

² https://github.com/ReComp-team/IPAW2018.
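
As a rough indication of what the two omitted helpers might look like, the
following Python sketch implements them over in-memory structures. It is not
the Prolog code from the repository. It assumes that was_derived_from maps each
artefact version to its direct predecessor, that used is a list of
(execution, artefact) pairs, and that traverse_derivations yields only the
predecessors of the item on the change front; the paper does not state whether
the item itself is included.

```python
def traverse_derivations(item, was_derived_from):
    """Yield the predecessors of item along the wasDerivedFrom chain,
    newest first (assumption: the item itself is not yielded)."""
    current = was_derived_from.get(item)
    while current is not None:
        yield current
        current = was_derived_from.get(current)


def iter_used(version, used):
    """Yield every execution that used the given artefact version."""
    for execution, artefact in used:
        if artefact == version:
            yield execution


# Facts for the change {b1 -wDF-> b0, e1 -wDF-> e0} of the Fig. 6 example.
was_derived_from = {"b1": "b0", "e1": "e0"}
used = [
    ("SSE1", "b0"),
    ("SSE3", "e0"),
    ("SE1", "e0"),
    ("SE2", "e0"),
    ("SE3", "e0"),
]

affected = [
    (pred, execution)
    for item in ["b1", "e1"]
    for pred in traverse_derivations(item, was_derived_from)
    for execution in iter_used(pred, used)
]
print(affected)
```

The cost of the traversal grows with the number of predecessor versions
visited, which is consistent with the observation that response time increases
with the length of the derivation chain.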

4 Related Work
A recent survey by Herschel et al. [9] lists a number of applications of provenance
like improving collaboration, reproducibility and data quality. It does not high-
light, however, the importance of process re-computation which we believe needs
much more attention nowadays. Large, data-intensive and complex analytics
requires effective means to refresh its outcomes while keeping the re-computation
costs under control. This is the goal of the ReComp meta-process [6]. To the best
of our knowledge no prior work addresses this or a similar problem.
Previous research on the use of provenance in re-computation focused on
the final steps of our meta-process: partial or differential re-execution. In [4]
Bavoil et al. optimised re-execution of VisTrails dataflows. Similarly, Altintas
et al. [2] proposed the “smart” rerun of workflows in Kepler. Both consider data
dependencies between workflow tasks such that only the parts of the workflow
affected by a change are rerun. Starflow [3] allowed the structure of a workflow
and subworkflow downstream of a change to be discovered using static, dynamic
and user annotations. Ikeda et al. [10] proposed a solution to determine the
fragment of a data-intensive program that needs to be rerun to refresh stale
results. Also, Lakhani et al. [12] discussed rollback and re-execution of a
process.
We note two key differences between the previous work and ours. First, we
consider re-computation in view of a whole population of past executions,
including executions that may not even belong to the same data analysis. From
the population, we select only those which are affected by a change, and for
each we find the restart tree. Second, the restart tree is a concise and
effective way to represent the change in the context of a past, possibly
complex hierarchical execution. The tree may be computed very efficiently and
also used to start a partial rerun. And using the restart tree, partial
re-execution does not need to rely on a data cache that may involve high
storage costs for data-intensive analyses [15].
Another use of provenance to track changes has been proposed in [8,11] and
recently in [14]. They address the evolution of workflows/scripts, i.e. the
changes in the process structure that affect the outcomes. Their work is
complementary to our view, though. They use provenance to understand what has
changed in the process, e.g. to link the execution results together or decide
which execution provides the best results. We, instead, observe changes in the
environment and then react to them by finding the minimal set of executions
that require a refresh.

5 Discussion and Conclusions

In this paper we have presented a generic approach to using provenance
annotations to inform a re-computation framework about the selection of past
executions that require a refresh upon a change in their data and software
dependencies. We call this selection the re-computation front. We have
presented an effective algorithm to compute the front, which relies on the
information about changes and annotations of re-executions. The algorithm can
handle the composite hierarchical structure of processes and help maintain the
most up-to-date version of the dependencies. Overall, it is a lightweight step
leading to the identification of the scope of changes, i.e. computing the
difference and estimating the impact of the changes, and then to partial
re-execution.
In line with [1], we note that a generic provenance capture facility which
stores basic information about processes and data is often not enough to
support the needs of applications. For our algorithm to work properly, we have
to additionally annotate every re-execution with the wasInformedBy statement,
so that past executions are not executed again multiple times. This indicates
that the ProvONE model defines only a blueprint with a minimal set of
meta-information to be captured, which needs to be extended within each
application domain.

References
1. Alper, P., Belhajjame, K., Curcin, V., Goble, C.: LabelFlow framework for anno-
tating workflow provenance. Informatics 5(1), 11 (2018)
2. Altintas, I., Barney, O., Jaeger-Frank, E.: Provenance collection support in the
Kepler scientific workflow system. In: Moreau, L., Foster, I. (eds.) IPAW 2006.
LNCS, vol. 4145, pp. 118–132. Springer, Heidelberg (2006).
https://doi.org/10.1007/11890850_14
3. Angelino, E., Yamins, D., Seltzer, M.: StarFlow: a script-centric data analysis
environment. In: McGuinness, D.L., Michaelis, J.R., Moreau, L. (eds.) IPAW 2010.
LNCS, vol. 6378, pp. 236–250. Springer, Heidelberg (2010).
https://doi.org/10.1007/978-3-642-17819-1_27
4. Bavoil, L., et al.: VisTrails: enabling interactive multiple-view visualizations. In:
VIS 05. IEEE Visualization, 2005, No. Dx, pp. 135–142. IEEE (2005)
5. Cala, J., Marei, E., Xu, Y., Takeda, K., Missier, P.: Scalable and efficient whole-
exome data processing using workflows on the cloud. Future Gener. Comput. Syst.
65, 153–168 (2016)
6. Cala, J., Missier, P.: Selective and recurring re-computation of Big Data analytics
tasks: insights from a Genomics case study. Big Data Res. (2018).
https://doi.org/10.1016/j.bdr.2018.06.001. ISSN 2214-5796
7. Cuevas-Vicenttín, V., et al.: ProvONE: A PROV Extension Data Model for
Scientific Workflow Provenance (2016)
8. Freire, J., Silva, C.T., Callahan, S.P., Santos, E., Scheidegger, C.E., Vo, H.T.:
Managing rapidly-evolving scientific workflows. In: Proceedings of the 2006 Inter-
national Conference on Provenance and Annotation of Data, pp. 10–18 (2006)
9. Herschel, M., Diestelkämper, R., Ben Lahmar, H.: A survey on provenance: what
for? what form? what from? VLDB J. 26(6), 1–26 (2017)
10. Ikeda, R., Das Sarma, A., Widom, J.: Logical provenance in data-oriented work-
flows. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE),
pp. 877–888. IEEE (2013)
11. Koop, D., Scheidegger, C.E., Freire, J., Silva, C.T.: The provenance of workflow
upgrades. In: McGuinness, D.L., Michaelis, J.R., Moreau, L. (eds.) IPAW 2010.
LNCS, vol. 6378, pp. 2–16. Springer, Heidelberg (2010).
https://doi.org/10.1007/978-3-642-17819-1_2
12. Lakhani, H., Tahir, R., Aqil, A., Zaffar, F., Tariq, D., Gehani, A.: Optimized
rollback and re-computation. In: 2013 46th Hawaii International Conference on
System Sciences, No. I, pp. 4930–4937. IEEE (Jan 2013)
13. Moreau, L., et al.: PROV-DM: the PROV data model. Technical report, World
Wide Web Consortium (2012)
14. Pimentel, J.F., Murta, L., Braganholo, V., Freire, J.: noWorkflow: a tool for col-
lecting, analyzing, and managing provenance from python scripts. Proc. VLDB
Endow. 10(12), 1841–1844 (2017)
15. Woodman, S., Hiden, H., Watson, P.: Applications of provenance in performance
prediction and data storage optimisation. Future Gener. Comput. Syst. 75, 299–
309 (2017)
CHAPTER XVII

TIME was spreading its rust and its vines over everything, eating
away the edges of his passions and fastening the hinges of his will
so that it could not turn.
The hate he felt for Chalender was slowly paralyzed. Having
forborne the killing of him lest the public be apprised of what he had
killed him for, it followed that Chalender must be treated politely
before the public for the same reason. Thus justice and etiquette
were both suborned to keep people from wondering and saying,
Why?
Being unable to avoid Chalender, he had to greet him casually, to
pass the time of day, even to smile at Chalender’s flippancies. Under
such custom the grudge itself decayed, or retreated at least to the
place where old heartbreaks and horrors make their lair.
There was much talk of Chalender’s splendid engineering work.
His section of the aqueduct prospered exceedingly. He had a way
with his men and though there was an occasional outburst, he kept
them happier and busier than they were in most of the other
sections.
He had a joke or a picturesque sarcasm for everyone, and the
men were aware that his lightness was not a disguise for cowardice.
They remembered that when two of them had fought with picks, he
had jumped into the ditch between them. He could now walk up to
drunken brutes of far superior bulk and take away their weapons,
and often their tempers. He composed quarrels with a laugh or
leaped in with a quick slash of his fist on the nearest nose.
People said to RoBards: “Fine lad, Harry Chalender, great friend of
yours, isn’t he? Plucky devil, too.”
That was hard to deny without an ugly explanation. It would have
been peculiarly crass to sneer or snarl at a man held in favor for
courage.
So the tradition prospered that Chalender and RoBards were
cronies. It was a splendid mask for the ancient resentment. And by
and by the disguise became the habitual wear, the feelings adapted
themselves to their clothes. He would have felt naked without them.
RoBards had to shake himself now and then to remind himself that
he was growing not only tolerant of Chalender, but fond of him.
This was not entirely satisfactory to Patty. She had a woman’s
terrified love of conflict in her behalf. A woman who sees a man slain
on her account suffers beyond doubt, but there is a glory in her
martyrdom. Patty’s intrigue had ended in a disgusting armistice, a
smirking truce. It was comfortable to have a husband and a home,
but it was ignominious to have the husband at peace with the
intruder.

The aqueduct was all the while growing, a vast cubical stone
serpent increasing bone by bone and scale by scale.
It still lacked a head, and RoBards the lawyer like a tiny Siegfried
continued to assail the dragon everywhere, seeking a mortal spot.
The Croton dam was yet to be built, as well as two big bridges and
two great reservoirs in the city. It grew plain that the seven miles
within the island of Manhattan would cost nearly as much as the
original estimates for the whole forty-six.
And the times were cruelly hard. The estimates rose as the
difficulty of raising money increased. Four and a half million dollars
were disbursed without the error of a cent, and the devotion and
dogged heroism of all the water army won even RoBards’
admiration.
By the beginning of 1841 thirty-two miles were finished, including
Harry Chalender’s section. He was called next to aid the work of
completing the dam. A new lake now submerged four hundred acres
of hills and vales with a smooth sheet of water.
Then the laborers on the upper line struck for higher wages and
marched down the aqueduct, driving away or gathering into their
own ranks all the workmen they met. They overawed the rural police,
but when the Mayor of New York called out the militia, the laborers
were forced back to their jobs.
The building of the dam was a work of titanic nicety. The rock
bottom of gneiss was so far down that an artificial foundation had to
be laid under a part of the wall, while a long tunnel and a gateway
must be cut through living rock. A protection wall was building from a
rock abutment, but there came a vast rain on the fifth of January and
it fell upon the deep snow for two days and nights. The overfall had
been raised to withstand a rise of six feet, but the flood came surging
up a foot an hour until it lifted a sea fifteen feet above the apron of
the dam.
Foreseeing a devastation to come, a young man named Albert
Brayton played the Paul Revere and ran with the alarm until he was
checked by a gulf where Tompkins Bridge had stood a while before.
Then he got a horn and played the Angel Gabriel: blew a mighty
blast to warn the sleeping folk on the other shore that their Judgment
Day had come.
The earthen embankment of the dam dissolved and took the
heavy stone work with it. Just before dawn the uproar of the torrent
wakened the farmers miles away as the catapult of water hurtled
down the river, sweeping with it barns, stables, homes, grist mills,
cattle, people, and every bridge across the Croton’s whole length, till
it flung them upon the Hudson’s icy waste.
The Quaker Bridge, which carried the Albany stages, went
swirling; also the Pines Bridge that Washington and his men had
traversed time and again. At Bailey’s iron and wire mills the snarling
wave fell so swiftly upon the settlement that it made driftwood of the
factory and flung fifty women and men from their beds into the
current. There was such a fleet of uprooted trees afloat that all of the
people were saved except two stout men who overweighted the
boughs they clung to. A Mr. Bailey waded breast deep carrying his
father and a box of gold in his arms and got them both to safety.
Harry Chalender played the hero as usual. After one laborer on
the dam had lost his outstretched hand and was drowned, he ran
along the black waters and darting in here and there brought forth
whatever his hand found, whether girl or babe, lowing calf or
squeaking pig. He brought one swirling bull in by the tail and had like
to have been gored to death for his courtesy. But with his wonted
nimbleness he stepped aside, and the bull charging past him
plunged into another arm of the stream and went sailing down with
all fours in air.
The collapse of the dam was a grave shock to the public
confidence. It meant a heavy loss in precious cash and its time
equivalent, but the Crotonians grew only a little grimmer, a little more
determined.
There was much blazon of Chalender in the newspapers, and a
paragraph describing how meek he was about the strength and
courage of his own hands and how proud of the fact that his section
at Sing Sing had stood the battering rams of the deluge without a
quiver.
Patty’s comment on this was a domestic sniff: “I suppose he got
his feet so wet he’ll catch a terrible cold. Well, I hope he doesn’t
come here to be nursed. If he should I’ll send him packing mighty
quick, I’ll tell you.”
Comment was difficult for RoBards, to whom the mention of
Chalender’s mere name was the twisting of a rusty nail in his heart,
but his mind leaped with a wonderful meditation:
There had been progress not only in the building of the aqueduct
but in the laying of a solid causeway under the feet of his family. A
sudden storm had swept Patty’s emotions over the dam of restraint
and wrecked their lives for a while, but now the damage was so well
repaired that she could speak with light contempt of the man who
had carried her heart away; she could say that she would shut in his
face the door to the home he had all but destroyed. Plainly the house
was now her home, too, and Chalender vagrant outside.
This thought filled RoBards’ heart with a flood of overbrimming
tenderness for Patty. He watched her when she tossed the
newspaper to the floor and caught her more exciting baby from its
cradle to her breast. She laughed and nuzzled the child and crushed
him to her heart and made up barbaric new words to call him. Calling
him Davie Junior and little Davikins was in itself a way of making
love to her husband by the proxy of their child.
The sunlight that made a shimmering aureole about her flashed in
her eyes shining with the tears of rapture. RoBards understood one
thing at last about her: She wanted someone to caress and to
defend.
He had always read her wrong. He had offered to be her
champion and to shelter her under his strong arms. But Chalender
had won her by being hungry for her and by stretching his arms
upward to drag her down to him.
RoBards felt that he had never really won Patty because he had
always been trying to be lofty and noble. She had rushed to him
always when he was dejected or helpless with anger; but he had
always lost her as soon as he recovered his self-control.
He wished that he might learn to play the weakling before her to
keep her busy about him. But he could not act so uncongenial a part
at home or abroad.
CHAPTER XVIII

AFTER years of waiting and wrangling, labor conflicts, lawsuits,
political battles, technical wars, and unrelenting financial difficulties
and desperate expedients, through years of universal bankruptcy,
the homely name of the Croton River acquired an almost Messianic
significance in the popular heart.
There was already a nymph “Crotona” added to the city’s
mythology. The thirsty citizens prayed her to hasten to their rescue
from the peril of another fire, another plague, the eternal nuisance of
going for water or going without.
Other history seemed of less importance, though tremendous
revolutions had been effected in the democracy. The property
qualification had been at last removed and the terrible risk assumed
of letting all men vote without regard to their bank accounts. The
religious requirements for office holders had also been annulled in all
the states. There had been fierce riots, of course, but the promised
anarchy had not followed. This gave a new boldness to the annoying
fanatics who asked for three downright impossibilities: the abolition
of slavery and of liquor, and the granting of equal rights to women.
Numbers of shameless females broke into public life and some of
them into breeches. Mobs of conservatives raided their meetings,
and chased them hither and yon; but still they raved and several
effeminate or half-crazed men openly preached against slavery in
the South. The bulk of the clergy of all denominations was, of
course, against them.
THE SUNLIGHT THAT MADE A SHIMMERING
AUREOLE ABOUT HER FLASHED IN HER EYES,
SHINING WITH THE TEARS OF RAPTURE

The Marquis of Waterford had made himself notorious with his
riotous gayety and his clashes with the night watchmen, the
Leatherheads. A fifty-year-old veteran of Waterloo had married a
sixteen-year-old heiress in a boarding school secretly and had
received enormous attention from the newspapers.
Fanny Elssler had danced herself into the favor of the people and
the horror of the pulpit. Daniel Webster had thundered for the Whigs.
The streets had roared with the campaign cry of “Tippecanoe and
Tyler, too.” Hard cider had become a slogan and log cabins a
symbol. A log cabin had been built at Harrison’s, a few miles from
Tuliptree Farm. It served later as a schoolhouse. Then President
Harrison died of indigestion a month after his inauguration.
The hard times grew harder and harder. The inpour of foreign
immigrants increased till New York became almost a foreign city. The
Native-Americans anxiously formed a party and their nominee
received all of seventy-seven votes; he was a painter named S. F. B.
Morse who had invented a curious toy he called the telegraph. He
wanted Congress to help him stretch a wire from Washington to
Baltimore for him to play with.
The churches started an hegira uptown. One of them was set out
as far as Tenth Street on Fifth Avenue, which had recently been
opened through the farms beyond Washington Square. A mission
had been established in the foreign world of the Five Points, where it
amused the populace of the brothels and crime cellars.
Crime increased and flourished appallingly and the newspapers
were unfit for the home. The murderer Colt, having cut up the body
of his victim, salted it, and shipped it to New Orleans; was caught,
tried, and convicted; then, having married a foolish woman in his cell,
stabbed himself to death and died while the guards of the Tombs
fought a fire of mysterious origin.
The “beautiful cigar girl” furnished another mystery and an excuse
for revolting journalism. RoBards had bought tobacco of her during
his exile in town and had watched with sardonic disdain the wily
smiles she passed across the counter to her customers who came
more for flirtation than for weeds. One day she vanished and after a
time her body was found drifting in the river near the Sibyl’s Cave in
the beautiful Elysian Fields at Hoboken. She had evidently fought a
desperate battle with her murderer, but had been flung bruised and
beaten into the water. Her murderer was never discovered. People
said he was a naval officer, but they could not prove it.
One of the cheap and popular newspaper men named Edgar Allan
Poe made an ephemeral mystery story out of it. It was exciting but,
of course, not literature. His name was never included in the list of
dignified authors whom the defenders of American art compiled to
prove to the English critics that good writing was possible on this
side of the Atlantic.
Dr. Lardner came over from England and proved conclusively that
steam was impracticable for crossing the ocean. Shortly afterward a
steamer brought across the popular English serial writer, Charles
Dickens, and the people lavished on him attentions which he
rewarded with infuriating contempt. Captain Marryat and other
Englishmen, and women like Mrs. Trollope, began a book
bombardment against the pride of the new republic, and roused it to
fury.
But all the while the city panted like a hart for its Croton water
brooks, and the engineers redoubled their efforts. They decided not
to wait for the High Bridge and improvised a temporary passage
across and under the Harlem River. The hope was revived that water
would come into the city on Independence Day.
Swarms of masons toiled at the two reservoirs until they stood at
last waiting, like vast empty bowls held up to heaven for a new
Deluge. The flood was to be received at the Yorkville reservoir,
carried on by iron pipes to Murray’s Hill, and distributed thence by
pipes about the city, with a special dispensation to the old well and
tank that had been erected in 1829 at Thirteenth Street to feed the
hydrants that replaced the foul old public cisterns.
Everywhere the streets and the houses were torn to pieces, pipes
were laid in all directions and fountains built. The plumber was the
hero of the hour.
The test of fashion was a faucet in the kitchen.
On a hot day in June the Water Commissioners and the engineers,
including Harry Chalender, began a strange pilgrimage through the
thirty-three miles of tunnel, for a last anxious inspection. It took them
three days to make the patrol on foot.
The vents along the way for the escape of water from deep
cuttings and leakages were closed once for all. And on the twenty-
second of June the Croton River began its march upon New York. At
five o’clock in the morning the head of the stream was admitted and
on the primal tide, some eighteen inches deep, a boat was launched.
The Croton Maid weighed anchor to descend upon New York with
the “navigable river” from the north.
Harry Chalender made one of the four passengers on that
“singular voyage” through the great pipe at the rate of a little better
than a mile an hour. The “Maid” came up for air at the Harlem River
the next day, a Thursday, soon after the first ripple of the water laved
the borders of Manhattan Island.
The Commissioners formally notified the Mayor and Common
Council that the Croton River had arrived and would proceed after a
brief rest to Yorkville Reservoir.
On Monday afternoon the Governor of the State, the Lieutenant
Governor, the Mayor, and other distinguished guests drew up in
solemn array and greeted the “extinguishing visitor,” while the
artillery fired a salute of thirty-eight guns.
When the Croton Maid sailed into the reservoir she was made
grandly welcome and then presented to the Fire Department, with
appropriate remarks on the “important results pecuniary and moral
which may be expected to flow from the abundance of the water with
which our citizens are hereafter to be supplied.”
On the Fourth of July Queen Crotona resumed her royal progress
and proceeded the necessary parasangs to Murray’s Hill, pausing to
fire the salute of a beautiful jet of water fifty feet in air at Forty-
seventh Street.
It was noted by one of the observers that when the waters of the
Croton gushed up into the reservoir they “wandered about its bottom
as if to examine the magnificent structure or to find a resting place in
the temple toward which they had made a pilgrimage.” That river
was as much of a god to the New Yorkers as old Tiber ever was to
Rome, or Nilus to Egypt.
But thereafter the stream, like another conquered Andromache,
became the servant of New York, pouring into its thirsty throat twelve
million imperial gallons of pure water every day.
The people congratulated themselves upon this achievement of
their city single-handed in a time of national financial prostration. In
the memoir written by the chief engineer, J. B. Jervis, he proudly
compared the new aqueduct with the great works of Rome, built
under contracts with private speculators, paid for with the plunder of
ruined peoples, and “cemented with the blood of slavery.” The
Croton work was a triumph of a city of 280,000 inhabitants, who
wrought a task, said Jervis, “on a scale greatly beyond their actual or
any near future wants, but which, designed to endure for ages,
would bear record to those ages, however distant, of a race of men
who were content to incur present burdens for the benefit of a
posterity they could not know. Magnificent as may be the works of
conquerors and kings, they have not equaled in forecast of design,
and beneficence of result, the noble aqueduct, constructed at their
own cost, by the freemen of the single city of New York.”
Much eloquence, much of the bold and braggart Yankee
eloquence so distasteful to foreigners and to foreign-hearted
Americans, was squandered on that feat of theirs; but before they
talked, they had toiled; they sweat before they boasted; they fought
the epic before they chanted it; and their words were not so big as
the stones they heaved into place. Their phrases were less
ponderous than the majestic forty-six-mile sentence in stone they
wrote across the green valley of the Westchester hills, through rock
and air, over hills and ravines, through villages and streams, across
the Harlem River and down into the heart of Manhattan Island.
But the massive High Bridge was yet to build and the Croton had
yet to reach the lower fountains and the homes of the citizens. They
had waited long for it, and it meant miraculous relief to have the river
from far away magically bubbling in the very houses at the wizard
twist of a faucet handle, and sending up geysers of beauty in the hot
parks. Many of the New Yorkers who marveled told how they had in
their day paid a penny a gallon for water from the carts that peddled
the product of the “Tea Water Pump.” Even David and Patty RoBards
could remember when they fled the town and thought it doomed to
die of drouth and pestilence.
The city felt that this immortal benison must be commemorated
fittingly. When the New River had entered London the Lord Mayor
had addressed it in his full splendor. When the waters of Lake Erie
had come through the canal to New York they had been married to
those of the ocean with grandiose ceremonial.
So now the Board of Aldermen appointed a committee, and the
committee called upon General George P. Morris to write an original
ode and the Sacred Music Society to sing it. “The Society’s vocal
performers were rising two hundred, male and female.” The bells of
the churches were bidden to ring; the artillery to fire salutes. All the
distinguished personages on the continent were invited to attend and
witness the most resplendent procession ever devised.
The date was set for the fourteenth of October and the citizens
devoted themselves to the preparation of banners, uniforms, and
maneuvers, and the polishing of fire-engines, swords, shoes, and
phrases.
An invitation was addressed to the President of the United States,
but Mr. Tyler was prevented by “circumstances”; and the ex-
President John Adams by “indispensable engagements at home.”
Ex-President Van Buren found it not in his power to avail himself of
the polite invitation. Governor Seward had “a severe indisposition,”
but accepted. The British Consul accepted “with feelings of no
ordinary kind,” and remarked that “tyrants have left monuments
which call forth admiration, but no work of a free people for
magnitude and utility equals this great enterprise.” The Consul of
France presented his compliments and would be happy to join with
them. The Consul of Prussia had much pleasure in accepting. The
Consul of the Netherlands had the honor of joining. The Consul for
Greece and Count Heckscher, the Consul for Mecklenburg,
regretted, but the Consuls of the Two Sicilies, the Grand Duchy of
Hesse, Frankfort, and Venezuela accepted. The Consul of Mexico
was prevented by absence, and the Consul of Texas, the recent
republic of Texas, feared that his “engagements of the day would
deprive him of the pleasure.”
Officers of the navy, the army, the bench, governors, mayors,
engineers, bankers, and others innumerable accepted or declined.
The common people prepared to turn out in a body.
The enthusiasm was so pervasive, that even the children felt the
thrill of the epochal day. The RoBards youngsters, little Keith and his
sister Immy, were feverish. The very baby at Patty’s breast seemed
to beat the air and crow like chanticleer at the mention of the
Fourteenth of October.
The one sure bribe for good behavior was a promise to go to New
York for the parade; the one effective punishment a threat of being
left at home.
Hardly an account of the aqueduct or the festival omitted
Chalender’s name, and RoBards grew so accustomed to it that he all
but forgot the horror it had once involved.
He was himself infected by the glory of the hour. It was like seeing
one of the Pyramids dedicated, or the Sphinx christened.
Time that makes us grateful for our defeats and turns our victories
to chagrin dealt so with RoBards. Though he had hampered the work
and denounced its trespass on the rights of the landholders, he felt
glad now that he and they had been defeated. Chalender was
gracious in his triumph, and felt all the more genial since the victory
had been enhanced by the high mettle of the opponents.
So everybody was happy and proud, and the aqueduct itself took
on something of the sanctity of a long, long temple, a source of
health and security and of unbounded future growth.
RoBards spoke of this to Patty and said that the names of the men
who had fought this long battle through would be immortal.
“Who are they?” she said with a disconcerting abruptness.
And to save him he could not think of them, though he knew the
names of many picturesque criminals, and of persons whose only
importance was some fashionable prestige. He knew the names of
many who had pounded out a little poem or braided a piece of clever
fiction. He knew the names of manufacturers of popular soaps and
razor strops, but he could not recall the giants who had wrested the
rocks from the hills and laid down the new channel for the river that
would redeem the chief city of the continent.
He had to refer to the memoir of the Commissioners and to read
aloud the passage: “Samuel Stevens, Esq., was the presiding officer
of the Board of Commissioners in 1829, whose name and services
will be recorded with those of Stephen Allen, and Douglas and
Jervis, for the enduring gratitude of the distant generations, whose
health, comfort, and safety will, ‘while grass grows and water runs,’
continue to be promoted by the great work to which these gentlemen
devoted such faithful and intelligent care.”
Patty nodded: “Well, I’m sure I’m much obliged to them for making
New York safe to live in. We can go back now, can’t we?”
“Isn’t it beautiful up here?” he sighed, without much enthusiasm.
“Yes, but the nights are bitter cold and the days are getting raw,
and the leaves are nearly all gone. I’ve been here for years and the
children have had all the diseases there are and got over them.
They’re out of danger. Let’s go back, David.”
When she called him by his first name it was like taking his heart
in her soft fingers. He had no will to resist. Besides, the house had
lost its integrity. It had played him false. It had permitted evil to
prosper, and he had sacrificed his dignity and his revenge to conceal
its shame.
Nothing worse could happen in the big city than in the stealthy
country. So he sighed again:
“All right! let’s go back!”
She sprang from her chair and kissed him and he took a poltroon
delight in the syrup of her lips. She became amazingly a girl again
and assailed with a frenzy the tasks of packing up for the removal to
town, the closing of the country home, and reopening of the house in
St. John’s Park.
She urged that she and Teen and Cuff should drive in and clean
the house, air it out, get the new water pipes put in and—while they
were at it, why not install gas? It was dangerous but so convenient!
All you did was turn a key and set a match and there you were! And
what about one of the new hot air furnaces to replace the odious
stoves and fireplaces?
She laid plans for such fairy improvements with a spendthrift
enthusiasm and proposed that her husband should stay comfortably
at home in the country with the two older children while she made
the house ready.
She was passionately domestic for the first time and when she
offered as a final inducement to take her father and mother to town
with her, RoBards could not deny her the toil or himself the repose.
He wanted a few days of communion with the ideal he was
resigning. He wanted to compose his soul anew for the new city life,
the country good-by.
The children, Immy and Keith, made a great to-do about their
mother’s knees, clinging to her and begging her not to go. And the
babe-in-arms, the miniature David, howled in trio, vaguely
understanding that something ominous was afoot. Patty was the
center of the battle. She held the infant under one arm while with her
free hand she tried to clasp both Immy and Keith. Her voice was soft
among the clamors, and she promised them everything if they would
only be good for a few days while she made the home ready in the
great city.
She looked up at her husband and he could see the weird pride in
her eyes. She, the frail, the pretty, the soulful, had been as an apple-
branch that bore these buds to flower and fruit from within herself
somehow. And they hated to let go, as perhaps the apple is reluctant
to be tossed into space by the wind that rends the twig.
RoBards had noted this cohesion in trees that were hard to fell
and split. Some woods would almost welcome the teeth of the saw
and the keen edge of the ax; they divided at a tap. But other trees
fought the blade, twisted it and flung it off and made a strange noise
of distress. And when the ax fell upon them they turned it aside,
caught it in withes of fiber and tore it from the helve.
Families were like that: some broke apart at the first shock; others
clung together as if they were all interlaced, soul and sinew. He
hoped that his household would be of this infrangibility.
Patty diverted the children from their grief by loading them with
tasks and warnings; the first was to take good care of Papa; the rest
were to take care of themselves amid the infinite risks that make a
jungle about children.
She murmured to her husband: “Watch out for those Lasher
children. That boy Jud has grown to a big hulking brute. He hangs
about the place—wants to steal something, I suppose. Drive him off
if you see him. And don’t let the children play with the Lashers. They
come by in the road, and they’re—not nice at all.”
She made the children promise to abstain from friendship with the
Lashers and from numberless other adventures; and at last she
broke from them and hurried to the carry-all. Cuff and Teen had gone
ahead in the wagon with the luggage. RoBards helped Patty and the
baby to the front seat and took his place beside her. Her father and
mother were already bestowed in the back of the carriage. RoBards
drove away, calling to the children that he would soon be home.
He and Patty had little to say of either their secret prides or
shames; old age had its eyes upon their shoulder blades, and was
perhaps subtly understanding from the glum wisdom of experience
that this young couple was gathering also much cargo that could
never be thrown overboard and must always be hidden away in the
deepest hold.
The length of the journey to New York was wonderfully shortened
now. RoBards put Patty and her parents and the servants on the
stage and she had only to ride as far as Harlem, where she would
take the New York and Harlem Railroad train. It had a steam engine
and a double track clear to City Hall, and some day it was going to
be extended to White Plains, and eventually perhaps to Chatham.
When he had seen the stagecoach whirl off with Patty and had
seen her handkerchief flaunt its last farewell through the dust,
RoBards drove home.
Or was it home now? Home seemed to be a something cloudlike
trailing after his wife. Home was the immediate neighborhood of his
love.
His heart ached with anxiety for her. What if she should not arrive
safely? The number of stagecoach accidents was astounding:
drunken drivers, runaway horses, capsizings, collisions, kept up an
endless succession of deaths and cripplings.
Thinking of Patty as perhaps doomed already he thought of her
with overwhelming tenderness. The very road backward was
denuded of the aureole she lent it. It stretched dour and stark in the
harsh outlines of autumn. The trees were stripped of leaves; the
lanes of their soft borders. Everything was naked and harsh. The
wind was ugly, cynical; it tormented the flocks of fallen leaves, sent
them into panics of flight with hoarse little cries and scurries.
This was no place for a rose like Patty.
He rode past the home of the Lashers. It was always autumn
there. However the wild flowers of spring held picnics in the lanes
and the weeds put on their Sunday calico; however the peach trees
and the plums and cherries in their disordered companies broke forth
into hosannas of bloom and pelted the yard and the house with petal
confetti, this house and this fence always sagged and creaked; the
shutters hung and flapped in the breeze; the family slumped,
eternally exhausted from the sheer neglect of industry.
None of the men was to be seen to-day; though the mother of the
family, as always, hung over the washtub, bobbing up and down like
a Judy on a string. She alone toiled, while the good-for-naught men
dawdled and leered. They were as vicious as the filthy dogs that ran
from the yard now and hurled themselves yelping at RoBards’
horses, trying to nip them while dodging their hooves. RoBards
drove them off with whip and yell and the horses bolted.
As he approached his own house at length, still fuming with anger
at the Lashers and their dogs, he saw his boy running toward him
along the road. He was shrieking: “Papa! papa! papa!”
When Keith came up alongside the carry-all, he was gulping for
breath, in such pain of fear and suffocation that he had to lean
against the wheel a moment before he could speak.
But his trembling hands pointed and his eyes were wild with fear
as he gasped:
“Papa!—bad man!—Immy!”
“What? where? when?”
“Just now—me and Immy play in the Tarn—big man comes—says
to Immy—‘Hello, little girl!’ She don’t say anything. He comes up
closeter. He reaches out. She cries—runs—he runs—grabs Immy. I
run and pound him with my fists and he won’t let go. He kicked me
into the Tarn. Yes, he did so! Then he runs away with Immy.”
“Who was it, do you know?”
“Jud Lasher.”
RoBards gave his horse a swift long slash with the whip and the
carry-all went into the yard on two wheels. He flung the lines on the
horses’ backs and, leaping across the wheel, ran madly past the
house and up the shaggy hillside toward the place that he and Patty
called “the Mystic Tarn.”
The boy followed, stumbling, holding his hand to his side where
the little heart thumped. His young eyes were aghast with the awe of
a terror beyond his ken.
CHAPTER XIX

BACK of the house and above it on a hilltop too rocky for clearing,
too rough for pasture even, was a little pool ringed around with huge
boulders. No one could explain them, though the Indians had
believed that they had been hurled in a battle of giants.
Tall trees stood up among them and canopied the pool with such
shadow that on the hottest days there was a chill there.
RoBards had brought Patty hither on their first visit to Tuliptree
Farm as bride and groom fugitive from the cholera plague. She had
cried out in delight at the spookiness of the place and he had called
it the Tarn of Mystery. He was not quite sure what a tarn might be but
the word had a somber color that he liked. And Patty had shuddered
deliciously, rounding her eyes and her lips with a murmurous “ooh!”
like a girl hearing a ghost story late at night.
He had helped her to skip from rock to rock like an Alpine climber
among glaciers, but when they came close to the pool glowing as an
emerald of unimaginable weight, she had recoiled from it in disgust,
because it seemed to her but a sheet of green scum. He explained
to her that what revolted her was an almost solid field of drenched
tiny leaves. But he could not persuade her to come near and admire.
She hated the look of it, and when she saw a tiny water snake
wriggling through it in pursuit of a frog, she fled in loathing.
In the fall the leaves came down from the trees in slow spirals.
They lay on the surface of the pool, which had not water enough to
draw them into its plant-choked shallows. The sharpening winds
swept them across the surface in little flocks.
The children loved to play beside the Tarn, though Patty told them
stories of Indians that had murdered and been murdered there. She
whispered to RoBards that when she saw the Tarn it always hinted of
suicide or assassination. The farmer, Mr. Albeson, laughed at this,
but his wife, Abby—even the children called her Abby—said they
was stories about the place. She had forgotten just what they was,
but like as not they was dead bodies there. Folks enough had
vanished during the Revolution, and maybe some of them was still
laying out there waiting for Judgment Day to rouse them up.
It was to this moody retreat that RoBards hurried now. He took one
rail fence at a leap and landed running, like a hurdler. He stumbled
and fell and was up again. Keith clambered after his father, crawled
through the fence and over the rocks till he came where Immy lay
bruised and stunned. Keith saw his father drop to his knees and lift
the child, clench her to his breast, and shake his head over her, then
raise his eyes to the sky and say something to God that the boy
could not hear.
The boy had always been reproached for tears and had been told,
“You’re a big man now and big men don’t cry.” Yet he could see that
his father was crying, crying like a little frightened girl. This strange
thing twisted the boy’s heart and his features and he pushed forward
to comfort his father. He was near enough to hear his sister
moaning:
“Papa—papa—I’m hurt—Immy’s hurt!”
Before the boy could touch him, RoBards lowered Immy gently in
the autumn leaves and put up his head and let out a strange sound
like a wolf’s howl.
Then he struggled to his feet, and ran here and there, looking,
looking. He climbed one of the high boulders about the Tarn and
stared this way and that; leaped down and vanished.
Keith ran past Immy whimpering and struggled up the steep slab
of the same boulder on all fours. Before he reached the top he could
hear voices, his father’s in horrible anger, and another voice in terror.
It was Jud Lasher’s voice and there was so much fear in it that
Keith’s own heart froze.
Sprawling at the peak of the boulder, he peered over, and there he
saw his father beating and kicking and hurling Jud Lasher about on
the sharp stones. He swung his fist like the scythe the farmer swung
and slashed Jud’s head and swept him to the ground; then picked
