Web Archives: A Step into Libraries of the Future

Submitted by jkolosov on October 20, 2018 - 11:45am

    Facebook post of the Sonoma Developmental Center on October 15, 2017 from the North Bay Fires web archive

For many people, libraries are where they go for the hottest “lucky day” books, or season five of The Americans, or to stream the latest Grammy winners. But public libraries are also where people go when they need one-of-a-kind resource materials such as a 1967 telephone book for the town of Geyserville or a copy of a hand-drawn diseño map of Rancho Petaluma.

Following the October 2017 wildfires, journalists and others came to the Sonoma County History & Genealogy Library to learn about the historic development of Fountain Grove, Coffey Park and other areas affected by fire. They reviewed Board of Supervisors and Sonoma County Planning Commission minutes from the county archives to understand past land use decisions, and they consulted newspaper clippings about the 1964 Hanly fire. Now pause for a moment and consider what the primary sources of the year 2018 will look like. What formats will they take? How should libraries be collecting and archiving these records?

As records of social life and community interaction increasingly live online, public libraries are recognizing the need to capture and preserve these digital traces as the primary sources for tomorrow's researchers. Following the 2017 fires, Sonoma County Library partnered with the Internet Archive to embark on just such an endeavor, joining 26 other public libraries in a grant-funded program called Community Webs. Sonoma County Library took the opportunity to build a web archive of websites, news, and social media content related to the fires – taking “snapshots” of the aftermath and recovery efforts as they were shared online. The North Bay Fires web archive documents the websites of the County of Sonoma and the City of Santa Rosa, including how they communicated their recovery and resiliency services to the public. The archive also contains the websites of groups that formed in response to the fires, such as Coffey Strong and UndocuFund, in addition to blog posts, Facebook posts and tweets reflecting a range of emotions from shock and anger to heartbreak and hope. The web archive supplements the oral histories, wildfire stories, artwork, poetry and prose, and artifacts gathered and exhibited by other local organizations – together revealing the many facets of the lived experiences of Sonoma County residents.


The North Bay Fires 2017 web archive is hosted by Archive-It, a web archiving service offered through the Internet Archive, along with many other web archives of our recent past, including collections on Hurricane Katrina, Black Lives Matter, and global human rights. You can browse or search among collections by topic or collecting organization. Within a collection, use the tabs to view a list of “sites,” or run a text-level search for a particular word or phrase. Clicking on the title or URL of an entry will bring you to a calendar page showing the dates on which the URL was captured. Selecting a specific date will bring you into the Internet Archive’s Wayback Machine to view the web content as it looked on that particular date.

Highlights of the collection include the Facebook page of the Sonoma Developmental Center, the Berkeley firefighters’ video, the GoFundMe – Northern California fire relief webpage, the website of the Sonoma Ecology Center featuring time-lapse videos of vegetation regrowth, the Santa Rosa Fire Department’s Twitter page, and a YouTube video of the Sonoma County Day of Remembrance (October 28, 2017).

Web archiving technologies are in constant evolution, trying to keep pace with the dynamic Web. You may encounter missing content or broken links as you navigate the North Bay Fires web archive – that is the nature of trying to capture digital content from a variety of ever-changing structures and sites. Please be patient as we test these new tools. The challenging mechanics of web archiving, as well as the ethical issues it raises, make web archiving a ripe domain for more evaluation and research, which is occurring in projects such as Documenting the Now, an effort to collect and preserve digital content from Twitter while respecting the rights of content creators. Nevertheless, institutions like the Library of Congress are moving ahead in the domain of digital stewardship, recently releasing 4,240 new web archives across 43 event and thematic collections.

The October fires prompted this “pilot” project, which the library can now assess as it considers expanding its collecting and archiving of online content. The question is not whether public libraries should archive web content but rather how libraries can join with other local institutions and organizations to take collective responsibility for preserving the primary sources of the future.

For more information, contact Joanna Kolosov, jkolosov (at) sonomalibrary (dot) org
