@pdurbin
Last active August 29, 2015 14:22
2015-06-09 Towards a Common Deposit API (the Dataverse example)
<!DOCTYPE html>
<html>
<head>
<title>2015-06-09 Towards a Common Deposit API (the Dataverse example)</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<style type="text/css">
@import url(http://fonts.googleapis.com/css?family=Yanone+Kaffeesatz);
@import url(http://fonts.googleapis.com/css?family=Droid+Serif:400,700,400italic);
@import url(http://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic);
body { font-family: 'Droid Serif'; }
h1, h2, h3 {
font-family: 'Yanone Kaffeesatz';
font-weight: normal;
}
.remark-code, .remark-inline-code { font-family: 'Ubuntu Mono'; }
.remark-slide-content { font-size: 25px; }
.title h1 { font-size: 70px; }
.agenda { font-size: 26px; }
.footnote {
font-size: 15px;
position: absolute;
bottom: 1em;
left: 3em;
right: 1em;
}
.quote { font-size: 18px; }
.bigbullets {
font-size: 2em;
}
.smaller {
font-size: 0.7em;
}
.bigtext {
font-size: 1.5em;
}
</style>
</head>
<body>
<textarea id="source">
class: bottom, title
# Towards a Common Deposit API (the Dataverse example)
.center[<img src="dataverse-logo.png" width="400px"/>]
.smaller[
<img src="elizabeth-quigley.png" width="60" align="left" style="border:5px solid transparent"/>
Elizabeth Quigley
http://www.iq.harvard.edu/people/elizabeth-quigley
<br/>
<br/>
<br/>
<img src="philip-durbin.jpg" width="60" align="left" style="border:5px solid transparent"/>
Philip Durbin ([@philipdurbin](https://twitter.com/philipdurbin))
http://www.iq.harvard.edu/people/philip-durbin
http://greptilian.com
]
.center[.smaller[2015-06-09]]
---
# Data + Article Depositing Now
- Many repositories: Figshare, Dataverse, Zenodo, Dryad, etc.
- Many publishers: PLOS, F1000 Research, Data in Brief (Elsevier), Palgrave Communications (Palgrave Macmillan / Nature), etc.
- Ways to do this: no automation, or a fully automated workflow
---
# No automation
Author and Journal have to work in two different places:
- the repository
- the journal publishing system
???
Without automation, the repository and the journal publishing system are silos that don't talk to each other. The author and the journal have to go to two different places to manually manage, update, and review the material:
a) the journal article in a journal publishing system
b) the dataset associated with the journal article in a repository
---
# Automated experience
Streamlined workflow so authors and journals don't have to work in two different places.
???
Combines the journal publishing workflow with the data publishing workflow so that authors and journals don't have to go to two different places to manage them.
---
# Workflows
.center[
<img src="workflows.png" width="600px"/>
]
.smaller[
http://datascience.iq.harvard.edu/presentations/data-publishing-workflows-dataverse-0
]
---
# Fully Automatic: OJS + DVN 3.6 (2013)
.center[<img src="sloan-pkp.png" width="300px"/>]
<img src="ojs-dataverse-arrows.png" width="700px"/>
SWORD support shipped in DVN 3.6 in late 2013 and has been carried forward into Dataverse 4.0.
.smaller[http://projects.iq.harvard.edu/ojs-dvn https://youtu.be/ftK1_IvWaVc]
---
# Introducing SWORD
.center[
<img src="sword-logo.jpg" width="300px"/>
<br/>
Simple Web-service Offering Repository Deposit (SWORD)
]
- [http://en.wikipedia.org/wiki/SWORD_(protocol)](http://en.wikipedia.org/wiki/SWORD_%28protocol%29)
- a "profile" of AtomPub ([RFC 5023](https://tools.ietf.org/html/rfc5023)) from Google
- http://swordapp.org
- https://github.com/swordapp
- https://twitter.com/swordapp
---
# SWORD: The Good Parts
- well-defined standard, based on other standards (AtomPub)
  - http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html
- good support for publishing workflows
  - "collections" as containers (dataverses, etc.)
  - "In-Progress" HTTP header for "unpublished"
  - negotiation of packaging (SimpleZip, METS, BagIt)
  - deposit receipt
- popular in scholarly publishing
  - http://swordapp.org/sword-v2/sword-v2-implementations/
  - server library in Java (used by Dataverse)
  - client libraries in Java, PHP, Python, and Ruby
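???
The workflow features above mostly surface as HTTP headers on the deposit request. A minimal sketch of what a client might send, assuming a binary zip deposit: the collection URL, file name, and helper names are hypothetical, and the request is only built, never sent.

```python
from urllib.request import Request

# Packaging identifier for SimpleZip as given in the SWORD v2 profile.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_deposit_headers(filename, in_progress=True):
    """Return HTTP headers for a binary SWORD v2 deposit of a zip file."""
    return {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        "Packaging": SIMPLE_ZIP,
        # "true" tells the server the deposit is not finished yet.
        "In-Progress": "true" if in_progress else "false",
    }


def build_deposit_request(collection_url, zip_bytes, filename):
    """Build (but do not send) a POST to a SWORD collection URL."""
    return Request(
        collection_url,
        data=zip_bytes,
        headers=build_deposit_headers(filename),
        method="POST",
    )


# Hypothetical collection URL; a real one comes from the service document.
request = build_deposit_request(
    "https://repository.example.edu/sword/collection/demo",
    b"placeholder bytes standing in for a zip archive",
    "data.zip",
)
print(request.get_method(), request.full_url)
```

A real server would answer a successful deposit with the deposit receipt, an Atom entry describing what was stored.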
---
# SWORD: More Good Parts!
- Open Science Framework (OSF) integration
  - https://osf.io/getting-started/#dataverse
- Python package for Dataverse APIs
  - https://github.com/IQSS/dataverse-client-python
- R package for Dataverse APIs
  - https://github.com/ropensci/dvn
- (Future) Archivematica integration
  - http://www.rdc-drc.ca/the-rdc-federated-pilot-for-data-ingest-and-preservation/
- (Future) TOP SECRET integrations! :)
---
# Challenges with SWORD: Metadata
- lowest common denominator for metadata: dcterms
  - http://dublincore.org/documents/dcmi-terms/
  - 15 properties in the `/elements/1.1/` namespace
    - contributor, coverage, **creator**, **date**, **description**, format, identifier, language, **publisher**, relation, **rights**, source, **subject**, **title**, type
  - 55 properties in the `/terms/` namespace
    - isReferencedBy et al.
- 154 metadata fields in **base install** of Dataverse 4.0
  - http://guides.dataverse.org/en/4.0/user/appendix.html
.smaller[\[sword-app-tech\] client SHOULD add Dublin Core terms to the Atom Entry, MAY add any other metadata formats or foreign markup - http://www.mail-archive.com/sword-app-tech@lists.sourceforge.net/msg00384.html]
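???
To make the "lowest common denominator" concrete, here is a rough sketch of an Atom entry carrying a few dcterms properties, built with Python's standard library. The helper name and the metadata values are made up for illustration; anything beyond these flat terms is what the SHOULD/MAY wording above leaves to foreign markup.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Serialize Atom as the default namespace and dcterms with its usual prefix.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_atom_entry(metadata):
    """Build an Atom entry whose children are dcterms elements.

    `metadata` maps a dcterms property name to a list of values.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    for term, values in metadata.items():
        for value in values:
            child = ET.SubElement(entry, f"{{{DCTERMS_NS}}}{term}")
            child.text = value
    return entry


# Made-up values for illustration only.
entry = build_atom_entry({
    "title": ["Example Dataset"],
    "creator": ["Doe, Jane"],
    "description": ["A placeholder description."],
    "subject": ["Social Sciences"],
})
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Each property is a flat string, so richer fields from a 154-field schema have to be squeezed into these terms or carried some other way.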
---
# Challenges with SWORD: Dev Activity
.left[GitHub]
<img src="sword-github.png" width="300px"/>
<img src="sword-rss.png" width="300px"/>
.smaller[
"**I'm hoping, also, that there will be more funding** at this end at some point for some more 'core' development, but opportunities are currently vague, so I don't think there's any point in holding out for it. Instead it seems much better to try to increase community engagement with the code and try to sustain it that way."
-- Richard Jones, SWORD spec lead, in the "code governance" thread on sword-app-tech, July 2014
http://www.mail-archive.com/sword-app-tech@lists.sourceforge.net/msg00400.html
]
---
# Both SWORD and a "Native" API
.center[<img src="dataverse-logo.png" width="400px"/>]
- Dataverse is committed to supporting SWORD
- Dataverse 4.0 added a new JSON-based "native" API
  - all metadata fields supported
  - does more than Data Deposit
    - Search
    - Permissions
    - etc.
http://guides.dataverse.org/en/latest/api
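???
A rough sketch of what a JSON deposit through the native API could look like. Everything specific here is an assumption to check against the API guide linked above: the `/api/dataverses/{alias}/datasets` path, the `X-Dataverse-key` header, and the `datasetVersion` / `metadataBlocks` JSON shape. The server URL, alias, and token are placeholders, and a real dataset would likely need more fields than a title. The request is only built, never sent.

```python
import json
from urllib.request import Request


def primitive_field(type_name, value):
    """One single-valued metadata field in the assumed native JSON shape."""
    return {
        "typeName": type_name,
        "typeClass": "primitive",
        "multiple": False,
        "value": value,
    }


def build_dataset_json(title):
    """Wrap citation fields in the assumed datasetVersion envelope."""
    return {
        "datasetVersion": {
            "metadataBlocks": {
                "citation": {"fields": [primitive_field("title", title)]}
            }
        }
    }


def build_create_request(server_url, dataverse_alias, api_token, dataset):
    """Build (but do not send) a dataset-creation POST."""
    return Request(
        # Assumed endpoint path; verify against the API guide.
        f"{server_url}/api/dataverses/{dataverse_alias}/datasets",
        data=json.dumps(dataset).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header name for the API token.
            "X-Dataverse-key": api_token,
        },
        method="POST",
    )


# Placeholder server, alias, and token for illustration only.
request = build_create_request(
    "https://dataverse.example.edu",
    "demo",
    "PLACEHOLDER-TOKEN",
    build_dataset_json("Example Dataset"),
)
print(request.get_method(), request.full_url)
```

Unlike the dcterms route, each field names its own type, which is how a schema with many more fields than Dublin Core can be addressed directly.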
---
# A Common Deposit API?
<br/>
.bigtext[Should we all work towards using SWORD or are there other options?]
<br/>
## How to Get Involved
SWORD mailing list:
.smaller[http://swordapp.org/contact/]
Dataverse API Community Group:
.smaller[http://community.dataverse.org/community-groups/api.html]
</textarea>
<script src="http://gnab.github.io/remark/downloads/remark-latest.min.js" type="text/javascript">
</script>
<script type="text/javascript">
var slideshow = remark.create();
</script>
</body>
</html>