Man, planning a 3-2-1 backup strategy for CERN must be a nightmare!
10 years ago I worked at a university that had a couple of people doing research on LHC data. I forget the specifics, but there's a global tiered system (the Worldwide LHC Computing Grid) for replicating data coming off the LHC so that researchers all around the world can access it.
I probably don’t have it right, but as I recall, raw data is replicated from the LHC to two or three other locations (tier 1). The raw data contains a lot of uninteresting stuff (think a DVR/VCR recording a blank TV image), so those tier 1 locations process the data and remove all that unneeded material. This slimmed-down version is then replicated to a dozen or so tier 2 locations, where lots of researchers have access to HPC clusters for analyzing it. I believe tier 2 could even request chunks of data from tier 1 that weren't originally replicated, in the event a researcher had a hunch there might actually be something interesting in the “blank” data that had originally been scrubbed.
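If it helps to picture it, here's a toy sketch of that tiered, pull-on-demand idea in Python. Everything in it (the Site class, the skim step, the dataset IDs) is made up for illustration; the real grid runs on dedicated middleware I never touched.

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    """A site in the toy hierarchy: holds some datasets, knows who is upstream."""
    name: str
    tier: int
    datasets: dict = field(default_factory=dict)   # dataset id -> payload
    upstream: "Site | None" = None                  # where to fetch missing data

    def fetch(self, dataset_id: str):
        """Return a dataset, pulling it from the upstream tier on demand."""
        if dataset_id not in self.datasets:
            if self.upstream is None:
                raise KeyError(f"{dataset_id} not available upstream of {self.name}")
            self.datasets[dataset_id] = self.upstream.fetch(dataset_id)
        return self.datasets[dataset_id]

def skim(raw: bytes) -> bytes:
    """Stand-in for the tier 1 'throw away the blank frames' processing."""
    return raw.replace(b"\x00", b"")

# Tier 0 (CERN) holds the raw detector output.
tier0 = Site("CERN", tier=0)
tier0.datasets["run-123/raw"] = b"\x00\x00interesting-collision\x00\x00"

# A tier 1 site keeps a full raw copy plus a skimmed copy for downstream sites.
tier1 = Site("T1-example", tier=1, upstream=tier0)
tier1.datasets["run-123/raw"] = tier0.fetch("run-123/raw")
tier1.datasets["run-123/skim"] = skim(tier1.datasets["run-123/raw"])

# Tier 2 normally only mirrors the skimmed data...
tier2 = Site("T2-example", tier=2, upstream=tier1)
tier2.datasets["run-123/skim"] = tier1.fetch("run-123/skim")

# ...but can still ask upstream for the raw chunk if someone has a hunch.
print(tier2.fetch("run-123/raw"))   # pulled from tier 1 on demand
```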
The university where I worked had its own HPC cluster that was considered tier 3. It could replicate chunks of data from tier 2 on demand so we could analyze them locally. Mostly, our researchers would use tier 2 to do some high-level analysis, and when they found something interesting they would use the tier 3 cluster to do more detailed work. This way they could throw a significant amount of our university's HPC resources at targeted data rather than competing with hundreds of other researchers all trying to do the same thing on the tier 2 clusters.
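The tier 2 / tier 3 split was basically a two-phase pattern: a cheap broad scan over the shared tier 2 data, then pull only the promising chunks down for the expensive local pass. Again, just a made-up sketch; the function names, the catalog format, and the "energy" cut are all invented for illustration.

```python
def broad_scan_on_tier2(dataset_index):
    """Cheap pass over tier 2's summary catalog: just flag candidate chunks."""
    return [chunk_id for chunk_id, summary in dataset_index.items()
            if summary["max_energy_gev"] > 100]

def pull_to_tier3(chunk_ids):
    """Stand-in for replicating only the flagged chunks to the local cluster."""
    return {cid: f"full event records for {cid}" for cid in chunk_ids}

def deep_analysis_on_tier3(local_chunks):
    """Expensive pass that only runs on the handful of chunks pulled locally."""
    return {cid: len(data) for cid, data in local_chunks.items()}

# Toy tier 2 catalog: the summaries are small, the full chunks are not.
catalog = {
    "run-123/chunk-0001": {"max_energy_gev": 12},
    "run-123/chunk-0002": {"max_energy_gev": 450},
    "run-123/chunk-0003": {"max_energy_gev": 97},
}

candidates = broad_scan_on_tier2(catalog)   # runs against the shared tier 2 data
local = pull_to_tier3(candidates)           # only replicate what we actually need
print(deep_analysis_on_tier3(local))        # the heavy lifting stays on our cluster
```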