Re: [AUDITORY] F.O.S.S auditory data server 20TB - feasibility/utility ? (Nathan Barlow )


Subject: Re: [AUDITORY] F.O.S.S auditory data server 20TB - feasibility/utility ?
From:    Nathan Barlow  <nb.audiology@xxxxxxxx>
Date:    Tue, 15 Oct 2024 06:08:37 +0100

Hi S,

Thanks for your email, your advice, and the well-intended warnings and heads-up.

I wanted to clarify that it wouldn't be hosting people's datasets for them, but rather opening data up to the scientific community, like open source.

It was inspired by my own studies with EEG, and by listening to computational audiology and computational sciences: "what could someone else do with my EEG data of my own brain?" This project is to explore whether there's any interest out there.

People who have data could put it into a pool, which is then available to everyone (via, yes, "hosting", but it's actually an SSH connection to the box, so it's not "on the Internet" for everyone and their dog).

It's not a backup, nor the only copy, nor hosting. Users would be advised to keep their own local copy, backups, etc. that they normally use. They maintain their own data as per their own local regulations and/or needs.

This NAS, if you want to call it that, was 300 in total, using enterprise-grade hardware, what's called e-waste: corporations dumping gear on a 4-year cycle. It's currently 8GB RAM on an FM2+ dual core (this is 2011-era hardware capable of 64GB RAM, but that only affects speed of access and multitasking, so it's lower on the agenda) with 10 drive bays, 5 of them occupied at the moment. I anticipate the next cycle of big corporations dumping their "old" gear, specifically hard drives, to be 2TB or 4TB drives. Adding those would take the array from 5TB raw (usable is only 3TB due to ZFS RAID-Z2) to 15TB raw by adding ten TB, minus 2 for parity, a total of 13TB usable. Or even add 20, for 23TB. That cycle is realistically two years away, so 3TB for now. Each drive sells for peanuts (think the price of three craft beer pints) compared to retail value, as corporations simply wish to move to bigger or faster NVMe-style SSDs, which are more energy efficient, smaller, more reliable, and something like 4000x faster.
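The capacity estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the message: the function name `usable_tb` and the flat 2TB parity deduction are assumptions made for the sketch, and real RAID-Z2 overhead depends on drive sizes and on how the drives are grouped.

```python
# Reproduces the capacity estimate from the message. The flat 2 TB parity
# deduction is the message's own simplification, not a real ZFS calculation.
PARITY_OVERHEAD_TB = 2  # assumption: two 1 TB drives' worth of parity


def usable_tb(raw_tb: int, parity_tb: int = PARITY_OVERHEAD_TB) -> int:
    """Return the estimated usable capacity in TB for a given raw capacity."""
    if raw_tb < parity_tb:
        raise ValueError("raw capacity must be at least the parity overhead")
    return raw_tb - parity_tb


current_raw_tb = 5  # five 1 TB drives installed today

# Raw capacity for each scenario mentioned in the message.
scenarios = {
    "now": current_raw_tb,
    "add 10 TB": current_raw_tb + 10,
    "add 20 TB": current_raw_tb + 20,
}

estimates = {label: usable_tb(raw) for label, raw in scenarios.items()}

for label, usable in estimates.items():
    print(f"{label}: {usable} TB usable")
```

The three scenarios correspond to the 3TB, 13TB, and 23TB figures quoted above.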
More tech details on my blog: www.eresope.wordpress.com

My question and email to the AUDITORY List was more about seeing what people in our fields might imagine we could do with such a dataset specifically about hearing. Correlational studies? Large data sets? The Computational Audiology Network is an example of these types of concepts out there.

Best
- Nathan

On Fri, Sep 27, 2024, 1:28 PM Fastmail <sammosummo@xxxxxxxx> wrote:

> Hi Nathan,
>
> Respectfully, this sounds like a very bad idea for many reasons. Don’t make yourself responsible for maintaining other people's data. I would even advise against building a NAS to store your own data, as there are plenty of ways to securely store ~20TB that will be easier, cheaper, and quicker than self-hosting.
>
> S.
>
> From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> on behalf of Nathan Barlow <nb.audiology@xxxxxxxx>
> Date: Friday, September 27, 2024 at 12:15 AM
> To: AUDITORY@xxxxxxxx <AUDITORY@xxxxxxxx>
> Subject: [AUDITORY] F.O.S.S auditory data server 20TB - feasibility/utility ?
>
> Dear AUDITORY List members,
>
> Would there be any interest from members in a collective server with a data array of 20TB, the aim being to securely store auditory data from human participants in line with human ethics boards' requirements on data longevity? Think large open data sets. Think of others being able to browse your dataset for new insights, in a new way, with their specialism.
>
> Myself, I have found that long-term storage requires a specialised array to extend past the 10-year mark. This is, for example, the finalised dataset of semi- or even fully anonymised data, or equally the large original recordings (video, EEG, etc.) that span several GB a file.
>
> A description of the server can be found at: https://tinyurl.com/auditory-serve
>
> (Note the drives in the actual server are six 5TB HDDs in an array, with data longevity exceeding 15 years, but the description shows six 1TB drives, an older configuration.)
> It runs Linux but is SMB based (allowing Windows compatibility) and uses exFAT (limiting to 4GB per file), currently with SSH for secure connection and data transfer.
>
> I look forward with interest to any correspondence with AUDITORY members and their learned opinions on whether such an open source, non-profit server is of use within our auditory neuroscientific community, or indeed even hearing about your existing options in this space.
>
> nga mihi
> Nathan
>
> --
> Nathan Barlow
> BSc, PGDip, MSc(SpchSci)(Hons), CoP, MSc(Clinical Audiology)(Soton)
> www.eresope.wordpress.com
> @xxxxxxxx


This message came from the mail archive
postings/2024/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University