@drjwbaker
Last active March 15, 2019 16:54
Born-Digital Archives and Digital Forensics – Where are We Now?, University of London, 15 March 2019

Live notes, so an incomplete, partial record of what actually happened. Also, some material presented at this event was not intended to be shared, so notes may be patchy.

Tags: digiforensics19

Site: https://www.sas.ac.uk/events/event/19289

My asides in {}

Tweet embeds are things I liked that seemed relevant to include or that captured things I missed


10:00 – 11:00

  • Thorsten Ries: Born-Digital Archives and Digital Forensics: Where Are We Now?

Studying the institution that preserved the record.

.@RiesThorsten kicks off the morning session by asking - what footprint does the archivist leave while undertaking the forensic/preservation process? #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

Small archives have struggled with the standards set by the big, well-funded institutions .. standards of proof being developed in law enforcement - e.g. AI proofs of illegal intrusion - belong to a different category of standards of truth than those we see in the humanities

.@RiesThorsten points out that not only hardware and software obsolescence but also forensic tools obsolescence is a problem and I would say one I have already detected in my work #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

What is the object of preservation: content or document? bit-stream or original physical trace?

@RiesThorsten re born-digital archival policies - here's the inaugural Cambridge University Libraries Digital Preservation Policy (2018): https://t.co/LOjoGLJes6 & also a tool for undertaking annual review of #digipres policy & strategy https://t.co/5upH8rrUDa #digiforensics19

— somaya langley (@criticalsenses) March 15, 2019

Digital forensics is about finding the materiality of the record that software design is trying to hide from users.

  • Mark Bell: Risk and Trust in the Archive. Blockchain and Preservation Risk Modelling.

Using stats to model risk re preservation in the digital archive: what are the factors that lead us to trust our material? .. but just because a file is horribly corrupted doesn't mean you can't find a way to read it, but when you do you can lose some confidence that the file is what it is

Mark Bell: Managing 'risk, uncertainty and trust' in relation to archives generally, but particularly digital archives, is a key priority for The National Archives #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

  • Jane Winters: Context, Interpretation and Trust: Working with Web Archives.

We think we know how to use a web archive because it looks like the live web. But it isn't .. digital archives impose artificial boundaries (often nation based) on the archived web .. 'what is an original' in the context of web archives doesn't make sense because of multiple, overlapping crawls .. the Memento protocol looks for examples of web pages close to your search, but it constructs a hybrid page from fragments that exist, so you get a page that never existed. The good thing is that input from researchers has helped interfaces make the hybridity clear to the user .. web archives are fluid .. though fluidity could be seen as a problem, the multiplicity of copies can also help gain trust in objects
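The hybrid-page point can be made concrete with a toy sketch. This is not the Memento protocol itself (no HTTP datetime negotiation happens here), and the capture index, URLs and dates are all invented for illustration. It only shows how picking the nearest capture for each resource separately can yield a page whose parts never coexisted.

```python
from datetime import datetime

# Hypothetical capture index: resource URL -> times an archive crawled it.
CAPTURES = {
    "http://example.org/": [datetime(2010, 1, 5), datetime(2012, 6, 1)],
    "http://example.org/style.css": [datetime(2009, 11, 20), datetime(2013, 2, 14)],
    "http://example.org/logo.png": [datetime(2011, 3, 3)],
}


def nearest_capture(captures, target):
    """Return the capture time closest to the requested datetime."""
    return min(captures, key=lambda captured: abs(captured - target))


def assemble_page(index, target):
    """Pick, for each resource independently, the capture nearest the target."""
    return {url: nearest_capture(times, target) for url, times in index.items()}


def is_hybrid(page):
    """A replayed page is hybrid when its parts were captured at different times."""
    return len(set(page.values())) > 1


page = assemble_page(CAPTURES, datetime(2010, 1, 1))
for url, captured in page.items():
    print(url, captured.date())
print("hybrid:", is_hybrid(page))
```

In this made-up index the HTML, stylesheet and logo come from three different crawls, which is the kind of hybridity an interface would need to surface to the researcher.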

11:20 – 12:30

  • Elizabeth Lomas: RecordDNA - Exploring the Concept of the Digital Record and What Implications Are there for the Usability of the Future Evidence Base?

Where Are We? Where are We Now? am attending #digiforensics19 and it's reminded me of a conference paper presentation i gave in 2005 in Melbourne on complex born-digital objects: https://t.co/cwZiYcYu2Q

— somaya langley (@criticalsenses) March 15, 2019

What are the evidential needs of people who use records .. recognised need to draw on the past to construct the key components of trust in the digital record .. there are different visions of what a record can be .. prioritising the research agenda is hard as everyone sees every aspect of the record as important .. archivists are trying to lock down their content, but are they doing enough to help people interact with digital objects? ..

  • Fiona Courage: When the Paper Turned Digital: Transformation in the Mass Observation Archive.

Small archives and the challenges of digital .. by anonymising born-digital records, MOP have lost the forensic features of the born-digital archive, things they treasure in the paper archive

#digiforensics19 - @fionacourage @MassObsArchive - how brass paperclips have become signal of opening up a life story - BUT emails are different - concerned that the emotional Materiality may be lost. Discovering that #digital #forensics may provide the answer !

— Tim Gollins (@timgollins) March 15, 2019

#digiforensics19 In @fionacourage 's story, it's possible that file format migration can remove important contextual metadata, which puts the character of the @MassObsArchive at risk.

— Ed the Archivist (@EdwardPinsent) March 15, 2019

Really interesting discussion about the challenges of preserving anonymity when you are dealing with born-digital archival materials. People don't know how much information about themselves they will have left behind in a Word document, for example #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

  • Callum McKean: E-Mail in the Archive of Wendy Cope.

McLean is keen to expose workflows and processes so that others (researchers/practitioners) can comment on it - constant improvement #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

Prioritise getting the material over being too strict over how we get it .. appraisal and logical arrangement happens after extraction from the forensic .e01 file (no old and new orderings are kept) ..

Jonathan Pledge and Eleanor Dickens have written a really interesting article about their work with the born-digital material in the Wendy Cope archive https://t.co/RfVphULue8 #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

.. highly restricted access to the archives that make their way into the reading rooms: stuck in the metaphor of making digital files analogous to paper archives, and this is particularly problematic for email .. very hard to verify the authenticity of senders of email when presented with an archive of emails received .. threaded nature of email means it is not amenable to the same logics as a letter .. Question: where is the cost benefit analysis on this?

#digiforensics19 - all of today's discussions run slap bang into #DataProtection issues. very challenging in #ethical #legal & #regulatory matters. @j_w_baker observing that digital access via reading room is if fact a massive step forward and to be welcomed.

— Tim Gollins (@timgollins) March 15, 2019

Jenny Bunn is the first person to raise the crucial question of resourcing all of the extra work involved with born-digital archives (while paper business as usual continues alongside) #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

  • Jenny Mitcham: The WordStar Files: the Truth Is out there .. Probably.

Marks and Gran archive (writers). Materiality of the digital. Examples from her work at the Borthwick Institute .. problem of software like Quick View Plus not representing a file correctly: bugs in how QV+ is presenting a file or feature of the original? .. so, had to go back to the original software ..

#digiforensics19 @Jenny_Mitcham - in addition to the "content" there was other "stuff" (thought to be unimportant and accidentally submitted initially .... later this view changed) shout out to "Quickview plus" - but how do we know Quickview is getting it right ?

— Tim Gollins (@timgollins) March 15, 2019

.. experience of 80s word processing lost when we emulate a file, we need the old manual to even understand how to use the software (e.g. keyboard shortcuts, logic related to a typewriter) .. print test page file shows how a page should look given that WordStar is not WYSIWYG: and by opening this in Quick View you get a sense of how the emulation differs from the original .. use of WordStar installed on Windows 98 to use the file in a more (though not perfect) native environment

#digiforensics19 - I am struck that there may be something fundamental underlying these discussions - what is the "minimum viable" capture method at the start of #digipres - I do not think that there is ONE answer here but we need a small number of defined options i think.

— Tim Gollins (@timgollins) March 15, 2019

13:30 – 14:30

  • Helen McCarthy: Using the Archived Web for Historical Research: Some Preliminary Fieldnotes.

#digiforensics19 The conference that @HistorianHelen mentioned - 'Contemporary Political History in the Digital Age' - is something she wrote about for the @LSEImpactBlog back in 2016 https://t.co/kQuhfiUwcK

— James Baker (@j_w_baker) March 15, 2019

Writing a book on working mothers up to the present. Using official web archives, via for example National Archive Govt web archive. Materials - e.g. voices magazine - don't appear hugely challenging: it feels pretty static, resembles pamphlets, familiar structure. Of course consumed differently, but still within remit of what historians are trained to do .. Very different story for other websites though, example here Mumsnet. Found plenty of threads on topic of interest, the novel "I don't know how she does it". But cherry picking? More than this, integrity of chatroom not captured through typing out selected quotations. Social, dynamic, unstructured, chaotic, feels like listening in on a private conversation happening in public. Don't have tools for this. How do we deal with live site vs archived site? Can/should we use screengrabs of deleted material? We share these problems with social media researchers.

#digiforensics19 @HistorianHelen - Describing "raiding" mumsnet using keyword search - and realizing that the discursive space is complex and threaded - extracts are not necessarily "safe to quote" - tools to analyse chat rooms are slowly emerging.

— Tim Gollins (@timgollins) March 15, 2019

  • Kees Teszelszky: Oral History and Forensic Web Archaeology: the First Dutch Online Literature Magazine De Opkamer (1994-2000).

Fewer objects, fewer written sources, almost everything digital. What is authentic now? .. what are the human perspectives on this? .. De Opkamer was an early example of web archiving: it was selling zip files of its own history! ..

.@keesone showing us a brilliant case study of the oldest web archive in The Netherlands, a zip file described on the website of a Dutch literary magazine which was archived by the Internet Archive #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

I can’t possibly do justice to this brilliant presentation from @keesone - really getting to the human behind the digital #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

By trying to figure out how to archive the web, the KB has been creating oral histories around the web to contextualise it .. web archives are part of the live web that web archives are archiving .. digital libraries and archives are part of the primary sources themselves, the source becomes us ..

.@keesone concludes that 'Hybridity is the new normal' - this is not just paper and digital, but, e.g., web archives become part of the live web, which is then in turn archived as a new primary source #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

  • Lise Jaillant: Dark Archive - the Case of Born-Digital Records in the Carcanet Press Collection.

Using email archives is hard. Data protection is an 'issue'.

If you’re interested in our work with the born digital archives of Carcanet Press (including the thorny ethical and legal issues) check out #poetrysurvival for updates as the project progresses @lisejaillant @TheJohnRylands #digiforensics19

— Victoria Stobo (@vstobo) March 15, 2019

What do we need to do with (born digital) cultural #heritage in a world of Big Data according to @isejaillant: reach out to computer scientists (@AndreasNL!), apply for bigger grants, improve the training of graduate students (@silvertje!). #digiforensics19

— Kees Teszelszky (@keesone) March 15, 2019

  • Patricia Falcao: Authenticity and Forensic Practices in the Conservation of Digital Art.

What is authentic? With a computational artwork it is not what the artist gives you but what it does. Authenticity requires a join between display equipment, media, space, and playback equipment to create conceptual elements.

#digiforensics19 @PatriciaMFalcao - authenticity is very complex matrix of different factors (Media, Display equipment, space .....) - only happens at the time of "performance" !

— Tim Gollins (@timgollins) March 15, 2019

Not that it isn’t always taken seriously but the level effort required to satisfy the artist and curator is probably not repeatable in other contexts #digiforensics19 https://t.co/U7eBdjqqoP

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

.@PatriciaMFalcao reflects on the level trust which exists between artist and institution - the “authenticity” and trust sits within that relationship. I think this is often true of archives/depositors #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

#digiforensics19 Every use case at this excellent workshop reveals another dimension of what "authenticity" might mean in digital preservation. I'm all for this multiplicity.

— Ed the Archivist (@EdwardPinsent) March 15, 2019

.@PatriciaMFalcao noting that it may be necessary to have access to artists' work environments in order to preserve their art effectively #digiforensics19

— Jane Winters (@jfwinters) March 15, 2019

15:00 – 15:45

  • Rachel MacGregor: Leaving Fewer Traces: Tackling the Digital Legacy of Eric Hobsbawm.

#digiforensics19 @An_Old_Hand - represents a first go at #digitalforensics - only a small proportion of Eric Hobsbawm papers are born digital. 15floppy disks in all - Working with Bit Curator @bitconsortium

— Tim Gollins (@timgollins) March 15, 2019

Connection between physical storage device and digital files, e.g. file list written on the disk .. many of the items not 'unique', for there are physical versions of the same documents .. things like BitCurator won't do it all .. how important is it for a file to render correctly? .. with personal papers, both ontology issues (original order is different with physical and digital) and the sequencing of archival practice make arrangement of papers likely to separate out the digital things as a separate section.

Insight from the creator - he states that he didn't keep drafts of his work and would delete the digital copy once the final version was published. Will the digital archive reflect this? @An_Old_Hand #digiforensics19

— Jenny Mitcham (@Jenny_Mitcham) March 15, 2019

  • Gabor Palko: Born Digital Manuscripts in a Literary Archive: Examples and Problems.

Hearing a use case about an old PC motherboard treated as a "museum relic"... shades of "A Canticle For Leibowitz". #digiforensics19

— Ed the Archivist (@EdwardPinsent) March 15, 2019

Hard drives stored as a relic in an art collection, with boots, pencils, hats, et cetera ..

I am recognising all of Palko’s descriptions of “finding” digital carriers lurking in the #archive My top tip folks: make a thorough digital asset register #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

#digiforensics19 Gabor Palko - Drawing valuable distinction between approach of special collection (literary archive - high Materiality value ) and public #archives (in the later little specific materiality value) - results in different approach to forensics (paper and digital)

— Tim Gollins (@timgollins) March 15, 2019

Gabor Palko (Hungary): discovered a "personal computer stored as a relic in the collection" & nothing had been done with it. so they cloned the HDD & discovered browser history, deleted data (that could be recovered) etc. #digiforensics19

— somaya langley (@criticalsenses) March 15, 2019

Discussion around the problems of lacking policies and procedures around ingest of and access to born digital collections #digiforensics19

— Rachel MacGregor (@An_Old_Hand) March 15, 2019

  • Jenny Bunn: Redefining Archival Processing

Archivists found digital in 1999 (Ross and Gow) https://core.ac.uk/download/pdf/42357524.pdf .. use BitCurator in teaching: recreate the graphical steps in the command line .. going back to basics of bitstream, file system, file walk .. archival labour is naturalised as invisible because archivists want to be recognised for their work whilst at the same time wanting to present their work as objective reality .. how much work needs to be done by the archivist for a researcher to stop thinking that they are working with raw data? .. archival agency is great until you think you should do everything yourself.
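As a rough illustration of what a "file walk" involves, here is a minimal Python sketch. It is not what BitCurator does internally and was not shown in the talk. It assumes the disk image has already been mounted or extracted to an ordinary directory, and the function names `sha256_of` and `file_walk` are invented for this example. It records each file's relative path, size and a SHA-256 checksum of its bitstream.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Hash a file's bitstream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_walk(root):
    """Walk a directory tree and describe every file found under it."""
    records = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            records.append({
                "path": os.path.relpath(path, root),  # location within the tree
                "size": os.path.getsize(path),        # length of the bitstream in bytes
                "sha256": sha256_of(path),            # fixity value for later comparison
            })
    # Sort by path so two walks of the same tree are directly comparable.
    return sorted(records, key=lambda record: record["path"])
```

Calling `file_walk("mounted_image")` (a hypothetical directory name) returns a list of dictionaries. Unlike a forensic tool reading the image directly, a walk like this sees only what the file system chooses to present, so it will not surface deleted files or unallocated space.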


Some admin...

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Exceptions: embeds to and from external sources, and direct quotations from speakers
