
Coping with the Explosion of Data in Life Sciences Research

White Paper Published By: HP

The emergence of genomics and advanced gene sequencing techniques has made the collection and storage of data a centerpiece of biomedical research. As the data generated in biomedical research grows richer, having infrastructure in place to handle that growth efficiently will be a cornerstone of biomedical data management. This white paper examines a joint solution that combines data reduction technologies with a network-attached storage system, offering storage optimization capabilities along with an affordable, manageable, and scalable petabyte-ready storage platform.



Tags : 
hp, data, storage, genomics, backup and recovery, network attached storage, storage management

HP
Published:  Aug 03, 2009
Type:  White Paper
Length:  6 pages

Coping with the Explosion of Data in Life Sciences Research
How storage is managed will either advance or slow the pace of biomedical discovery
www.ocarinanetworks.com
The emergence of genomics and advanced gene sequencing techniques has made the collection and storage of data a centerpiece of biomedical research. Not long ago, the Human Genome Project first sequenced the human genome as part of an international cooperative research effort. That first sequence took up about 750 gigabytes in 2000, an amount of data that would fit on a single disk today. Genomics research has since moved well past that first basic sequencing, and advances now come from increasingly sophisticated sequencing machines and technologies. Today, research institutions, universities, pharmaceutical companies and even hospitals generate genomic data almost continually. A modern lab might generate as much as 10 terabytes of data a day.
For example, Cornell University's Computational Biology Service Unit, which supports life sciences across its many research facilities and hospitals throughout New York State, often collects as much as a terabyte a day from each of its many sources. Putting the data onto tape backups is not ideal, because many researchers need immediate, fast access to a "hot copy" of the gene sequencing data they are analyzing.
"As scientific researchers acquire data at faster and faster rates, optimizing the analysis of that data with scalable storage solutions is essential," said Dr. David Lifka, Cornell Center for Advanced Computing director. "Despite advances in disk technology, storing research data remains an expensive proposition," he explained. "Ocarina provides a cost-effective way to maximize storage capacity without sacrificing performance."
This is a field where technology is advancing very quickly, and the next generation of machines from leading companies like Illumina and Affymetrix will generate even richer data and require even more storage to hold it. Because knowledge in the field is moving so rapidly, the value of the data may not yet be fully understood, so keeping it long term for later analysis could prove highly valuable for research. However, as the amount of data generated grows, the burden of capturing and storing it puts biomedical researchers at the center of a problem facing many parts of IT: coping with massive data growth.
For the most part, life sciences researchers are not storage experts, nor do they have a long history of running the world's largest data stores. They are being put in this position by the fast increase in the amount of rich data being generated by gene sequencers, ChIP sequencers, and other advanced technology. What's daunting is that the next generation of analyzers, sequencers and other genomics technology will generate even more data.
In fact, storage is such a crucial piece of the puzzle that the pace of genomics research could well be slowed by researchers' inability to deal with the onslaught of data. This could mean a slowdown in finding cures and treatments for the world's most pressing medical problems, such as cancer, heart disease, and many other diseases and conditions. Money to purchase storage, staff to manage it, data center space to house it, and energy to power and cool it will all become significant items in overall research budgets, consuming money that might otherwise be spent on research itself.
Content-Aware Compression and Data Deduplication for Online Storage
Backups Present Further Challenges
Another challenge of overwhelming data growth is the strain it puts on traditional backups. When data comes in at 10 terabytes or more per day, backing up to tape the old way becomes infeasible. Data reduction with content-aware compression and deduplication opens up alternatives for data protection and retention. Once the primary copy of the data has been processed and reduced to one-tenth its original size, it may make sense to create a replica of that data on another storage platform, rather than trying to back it up using legacy backup tools or tape.
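To make the scale concrete, the figures quoted above (10 terabytes a day of incoming data, reduced to one-tenth its original size) can be turned into a rough annual estimate. The following is only a back-of-the-envelope sketch in Python: the function name, the constant names, and the assumption of a steady 365-day ingest rate are illustrative and do not come from the paper.

```python
# Illustrative capacity arithmetic. All names here are hypothetical,
# and a constant ingest rate over 365 days is an assumption.
TB_PER_DAY_RAW = 10.0    # ingest rate quoted in the text
REDUCTION_RATIO = 10.0   # "one-tenth its original size"
DAYS_PER_YEAR = 365


def annual_footprint_tb(raw_tb_per_day, reduction_ratio, copies=1):
    """Return terabytes stored after one year for `copies` copies of reduced data."""
    reduced_per_day = raw_tb_per_day / reduction_ratio
    return reduced_per_day * DAYS_PER_YEAR * copies


# One year of unreduced data, single copy.
raw_year = TB_PER_DAY_RAW * DAYS_PER_YEAR
# One year of reduced data: primary copy only, then primary plus one replica.
primary_only = annual_footprint_tb(TB_PER_DAY_RAW, REDUCTION_RATIO)
primary_plus_replica = annual_footprint_tb(TB_PER_DAY_RAW, REDUCTION_RATIO, copies=2)

print(f"raw, unreduced:            {raw_year:.0f} TB/year")
print(f"reduced primary:           {primary_only:.0f} TB/year")
print(f"reduced primary + replica: {primary_plus_replica:.0f} TB/year")
```

Under these assumptions, a reduced primary copy plus a full replica occupies 730 TB after a year, about one-fifth of the 3,650 TB that a single unreduced copy would need.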
The replica can be stored in another location, on cheaper storage than production data. This serves the purpose of protecting data and making all data accessible in the event of a data loss on primar...
