Chris Trottier

Actually, I kind of wish there was a universal P2P protocol that was a mixture of HTTP and BitTorrent.

That alone would fix so many problems with the Internet!

27 comments
Chris Trottier

@anji I've used IPFS for three years, and I've yet to see broad adoption apart from crypto.

If something like archive.org were stored on IPFS, then that would be a game changer.

Polychrome :clockworkheart:

@atomicpoet @anji to use your example, neither PeerTube nor IPFS is a good archiving solution. Google can't be trusted long term, but IPFS forgets data as soon as nobody is pinning it, and PeerTube instances tend to shut down within months. I've been trying to use both and they've been very unreliable.

Disclaimer - I'm extremely pro-decentralization and pester everyone I know about it but I am also something of an archivist and I can't ignore the reality.

In the end, long-term data storage has to be done on offline media - which is also becoming a problem as more people switch to flash-based storage, since if you leave it unpowered for a couple of years the data will become corrupt and eventually useless, much faster than on classic magnetic media.

The net will always be ephemeral in the long term unless we work up something insane like Xanadu. So for now the most stable online archiving option is with people who care - e.g. archive.org.

Chris Trottier

@polychrome @anji Okay, but hear me out. What if archive.org was decentralized, and every library and university in the world ran an instance?

Konrad

@atomicpoet I'm sure hard disk companies would quite love that. @polychrome @anji

Doc Edward Morbius ⭕​

@atomicpoet Just to put some numbers on this...

The US has about 9,000 public libraries (administrative units) and another 3,000 or so academic libraries, for a total of 12,000 of both classes.

There is an estimated total of over 115,000 libraries in the country. (Many are public school libraries.)

web.archive.org/web/2018102617libguides.ala.org/numberoflibr

I'm going to assume that major-city public libraries and top academic libraries might be considered archival hubs. That's a hundred or so from each list, conservatively.

The US Library of Congress holds 40 million catalogued works (books, generally, a total of ~130 million items of various descriptions).

At 5 MB/book, total disk storage would run about $3,700 ($18.3/TB), for spinning rust. Other offline / nearline storage might be cheaper. I'm going to estimate a disk storage system at roughly 4x this cost, or just under $15,000. (This is probably high, I'm being conservative.)

That is, for $15,000, any library in the world could hold the entire works of the world's largest library, the Library of Congress.

For comparison, the Internet Archive budgets $2/GB for data in perpetuity. At 5 MB/book, that's $2 per 200 books or so.

Yes, "books" != "Internet data". But it's a comparison point.
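As a rough sanity check of the arithmetic above, here is a minimal Python sketch. The inputs (40 million books, 5 MB/book, $18.3/TB, the 4x system multiplier, and $2/GB in perpetuity) come from this post; the decimal unit conversions and the function name `estimate` are assumptions made for illustration only.

```python
# Back-of-the-envelope check of the storage estimate in this post.
# All input figures are taken from the post; units are assumed decimal.
BOOKS = 40_000_000        # catalogued works at the Library of Congress
MB_PER_BOOK = 5           # assumed average size per book
USD_PER_TB = 18.3         # raw spinning-disk price
SYSTEM_MULTIPLIER = 4     # rough factor for a full storage system
IA_USD_PER_GB = 2         # Internet Archive "in perpetuity" figure

MB_PER_TB = 1_000_000     # assumption: 1 TB = 10^6 MB
GB_PER_TB = 1_000         # assumption: 1 TB = 10^3 GB


def estimate():
    """Return the storage size and the three cost figures as a dict."""
    total_tb = BOOKS * MB_PER_BOOK / MB_PER_TB
    raw_disk_usd = total_tb * USD_PER_TB
    system_usd = raw_disk_usd * SYSTEM_MULTIPLIER
    perpetuity_usd = total_tb * GB_PER_TB * IA_USD_PER_GB
    return {
        "total_tb": total_tb,
        "raw_disk_usd": raw_disk_usd,
        "system_usd": system_usd,
        "perpetuity_usd": perpetuity_usd,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.0f}")
```

The raw disk and system figures should land close to the roughly $3,700 and just-under-$15,000 numbers quoted above; the perpetuity line simply applies the $2/GB rate to the same volume for comparison.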

@polychrome @anji

naxxfish

@polychrome @atomicpoet @anji it's true - you cannot trust anyone whose motives are not preservation to preserve your media. Archiving has been getting increasingly difficult and expensive over the years as the volume and diversity of media go up.

I'd go one further and say optical media - not necessarily CD/DVD, though - is the way to go, since it is formed by irreversible chemical/mechanical processes. Tapes and disks are fine - but are erasable and so less durable.

naxxfish

@polychrome @atomicpoet @anji also - tape heads have a finite lifetime (in hours read). Many kinds of tape machines (and thus heads) which were once common are no longer manufactured: thus there is a finite supply of tape heads. There are archives in the world which have more hours of media stored in them than there are tape head hours in the world. So some of the archive is already lost - it's just that we have to decide which bit we don't recover.

Rachael Ava 💁🏻‍♀️

@polychrome @atomicpoet @anji LTO Tape drives are quite popular for archival purposes, as they don't lose data easily when not in use for a long time.

Miłosz SP9UNB

@polychrome @atomicpoet @anji You can't expect a non-commercial instance to be reliable if you don't support it. Support an instance (Mastodon, PeerTube, etc.) or set up and share your own, and it will last forever.

Terry Hancock

@polychrome @atomicpoet @anji

After researching this problem for myself, I settled on two offline storage media:

1) I buy used 1-TB 3.5" hard drives and offline-storage cases.
The 1-TB size is a good match to my needs, cheap to buy (especially used, and used is fine -- for offline use, they'll get little wear).

2) M-Disc optical disks, DVD-M or M-Disc BDR, which are much more durable than dye-based media (will probably outlast the magnetic media of the hard disks).

Terry Hancock

@polychrome @atomicpoet @anji

I plan to use the 3.5" HDs to store expanded source trees, software/distro, and the complete EXR-stream renders of my output.

The EXRs are "intermediates". Regenerating them is expensive, but it is an automatic process, once the software is running.

I'll use the optical M-Disc media to store source files, PNG streams, video renders, and software archives.

As for the volatility of PeerTube, that's why I'm running my own instance, now.

Hopefully this works. 🤞

teledyn 𓂀

@polychrome @atomicpoet @anji the recent threat by #Google to oust #GSuiteLegacy users who wouldn't pay the ransom gave a new perspective on the net-archives, faced as I was with somehow finding new homes for 16 years' worth of life-history data for 8 users.

I think we must accept that "Digital Archive" is a contradiction in terms, a transient transport from A to B. Digital 'artifacts' are a 'volatile' variable contained within a scope that will inevitably be garbage collected.

Ton Zijlstra

@polychrome @atomicpoet @anji and/or archive yourself what you need/want to.

Leonie :pb:​ :22breadinv: :vf:

@atomicpoet @anji It looks like archive.org is actually planning to use IPFS. Have to look up the source later, currently at work.

Leonie :pb:​ :22breadinv: :vf:

@atomicpoet @anji ok, here's a follow-up to this. Looks like they removed any evidence for that; the only thing I could find was a cut version of the interview on archive.org. The whole interview isn't available anymore.

Tavi

@anji @atomicpoet ipfs backs a lot of Library Genesis. It's working well enough for them. I think it's just a matter of time.

Doc Edward Morbius ⭕​

@atomicpoet IPFS is one of the distribution mechanisms used by Library Genesis.

@anji

hacknorris

@atomicpoet
Theoretically possible, but it's only a theory and a number...

Jens Finkhäuser

@atomicpoet Welcome to the #interpeer project, or at least where I hope it'll be in a few months.

Jens Finkhäuser

@atomicpoet For something like archiving, IPFS may actually be a good choice. The goal for us is broader: we also want to support real-time scenarios such as live broad-/manycasting, as well as collaborative editing.

IPFS has a few features that mean it doesn't lend itself all that well to those scenarios where there are frequent updates to a resource.

hkc (Carbonated)

@atomicpoet imo that would be great not only from a data preservation standpoint, but also just because it's more convenient. I can't even count how many times I had to waste time waiting for slow CDNs that are thousands of kilometers away to give me the data I want. Decentralization not only makes content more persistent, but also makes content delivery much faster. But I guess at this point in time the average person won't really care about that, unfortunately.

Vftdan

@atomicpoet
There is #ZeroNet, but it has some problems & now it does not seem to be very popular

ansuz / ऐरन

@atomicpoet that's the kind of talk that floods your mentions with IPFS people
