Bitácora de fernand0 Cambiando de aires

Digital preservation and today's risks

Incunabula in their container

From time to time we talk about digital preservation and, in particular, about preserving the web. The article "As the Trump administration purges web pages, this group is rushing to save them" reminds us that part of the web is being actively deleted.

It is not only federal pages, some of which came back modified, but also datasets (mainly related to science and the environment).

After President Trump's inauguration in January, some federal web pages vanished. While some pages were removed entirely, many came back online with changes that the new administration's officials said were made to conform to Trump's executive orders to remove "diversity, equity, inclusion, and accessibility policies." Thousands of datasets were wiped — mostly at agencies focused on science and the environment — in the days following Trump's return to the White House.

The article reminds us that the Internet Archive exists and of its valuable work preserving sites that disappear over time.

The nonprofit, founded in 1996, is a digital library of internet sites and cultural artifacts. This includes hundreds of billions of copies of government websites, news articles and data. The Wayback Machine is the archive's access point to nearly three decades of web history.

Every day they take in around 100 terabytes of information, which they archive properly and then make available for consultation.

Every day, about 100 terabytes of material are uploaded to the Internet Archive, or about a billion URLs, with the assistance of automated crawlers. Most of that ends up in the Wayback Machine, while the rest is digitized analog media — books, television, radio, academic papers — scanned and stored on servers.

I think it is a project worth supporting, regardless of who happens to be president of the US: its work would be valuable even if nobody were trying to delete information.

And the article reminds us that this even happens on Wikipedia: a significant number of its links point to pages that no longer exist, but whose content can still be read thanks to this project.

"I don't remember the page but, you know, a significant percentage of the links that were on the Wikipedia article are Internet Archive links," he said. "That is really sad — that what people view as a primary source is something that doesn't exist anymore."
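For anyone who wants to check whether a dead link has an archived copy, here is a minimal sketch in Python. It assumes the Wayback Machine's public availability endpoint at `https://archive.org/wayback/available` and the JSON shape I understand it to return (an `archived_snapshots` object with an optional `closest` entry). The function names `availability_url`, `closest_snapshot` and `find_archived_copy` are my own, not part of any official client, and none of this comes from the article.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Wayback Machine availability API.
API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a decoded JSON response, or None.

    Assumes the response looks like:
    {"archived_snapshots": {"closest": {"available": true, "url": "..."}}}
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url, timestamp=None):
    """Ask the API (network call) for the closest archived copy of page_url."""
    query = availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `find_archived_copy("example.com")` should return the address of a snapshot if one exists, or `None` otherwise. The parsing is kept separate from the network call so it can be tried on a saved response without hitting the service.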

They also face legal threats, which are intended to discourage archiving and the work of this organization.

Founder Kahle said the costly lawsuits — which legal experts say are meant to be a deterrent — threaten the future of the archive.