NIH National Center for Biotechnology Information Deploys StorHouse from FileTek for Critical Bioinformatic Application

StorHouse Entrusted with Vital Next-Generation Genomic Sequencing Data

Rockville, Maryland – March 18, 2010 – FileTek, Inc., a leading global provider of large-scale data management and information governance solutions, today announced that the National Center for Biotechnology Information (NCBI) at the National Institutes of Health (NIH), the primary Federal agency for conducting and supporting medical research, recently purchased and deployed multiple FileTek StorHouse® storage virtualization and data management systems as an active archive and backup solution for storing, accessing, and protecting all critical NCBI Sequence Read Archive (SRA) data. SRA data is next-generation genomic sequencing information produced at numerous bioinformatic research facilities and centers throughout the United States. The information is then written to StorHouse and made available to a prestigious, global biomedical research community.

Currently, NCBI/NIH has deployed two production StorHouse systems and two backup/disaster recovery StorHouse systems at geographically dispersed locations in the Washington, D.C. metropolitan area. The long-range, two-year plan is to grow these installations to dozens of StorHouse instances in multiple locations to accommodate the anticipated 12-petabyte archive size. In addition, StorHouse is responsible for storing and backing up vital SRA project information created at the National Library of Medicine in Bethesda, Maryland.

FileTek wishes to acknowledge Cambridge Computer of Waltham, MA, for its role in helping the NIH Medical Library identify StorHouse as the best fit for the NCBI's technical requirements. "The NCBI storage repository is the most exciting storage project in the bioinformatics world today," commented Jacob Farmer, CTO of Cambridge Computer. "This system will be storing and protecting 12 petabytes of genomic sequencing data gathered from research sites across the country. FileTek's StorHouse has several unique features that made it stand out as the optimal choice for managing what is undoubtedly the most important cache of life sciences data in the world today."

"FileTek is pleased that NCBI/NIH chose StorHouse for its groundbreaking SRA application," remarked Chuck Whinney, the FileTek General Manager for StorHouse. "When faced with the ultimate challenge of designing the SRA application, NCBI/NIH realized early on that StorHouse met all system requirements, including ensuring data integrity, facilitating data access, and scaling to petabytes and beyond to support future growth. Furthermore, because the StorHouse systems at NCBI/NIH are media/hardware agnostic and use a blend of cost-effective storage options, they also provide lower storage and data management costs. As the SRA initiative progresses, FileTek looks forward to pursuing more projects at NIH and at other biomedical research centers around the world."

About StorHouse

StorHouse is an intelligent, hardware-agnostic storage virtualization and data management platform for archiving, retrieving, and backing up large volumes of relational and file-based information. The system supports an automatically managed pool of traditional and alternative storage devices, including tape, disk, solid state devices, and most other generally available storage technologies. Enterprises deploy StorHouse as an integral component of an overall enterprise storage infrastructure for active archive applications, database extension systems, information lifecycle management initiatives, digital preservation programs, and native file format backups of terabytes to petabytes of data residing on operational systems.

StorHouse has many unique features that support biomedical research applications such as the SRA program at NCBI/NIH. These features include:

  • All-in-one platform functionality that provides both archive and backup capabilities
  • An industry-standard relational file system layer that archives, retrieves, and/or backs up any format of unstructured data in native file format with no need for application modifications
  • Scalability to trillions of files and multiple petabytes of managed data with no performance degradation
  • A native file format backup approach that eliminates the need for traditional restore processing
  • Virtualized storage that cost-effectively uses traditional and alternative storage media to lower the total cost of data ownership and provide a measurable, cost-correct ROI
  • Automated system, storage, and data management, including storage allocation and control as well as data migration, replication, backup, recovery, and retention
  • Automatic content validation and repair processes to ensure data integrity and archive reliability
  • Easy migration to newer and more advanced technology as it becomes available with no performance degradation

About FileTek

FileTek, Inc. is a premier provider of large-scale data management and information governance solutions, enabling organizations, worldwide and across multiple industry segments, to efficiently manage, rapidly access, and effectively govern their ever-growing volume of enterprise data. Since 1984, FileTek has provided comprehensive, award-winning solutions to companies, prestigious educational institutions, and scientific and government agencies worldwide. From our patented and innovative StorHouse high-volume data management solutions to our Trusted Edge information classification and asset management software, FileTek maintains a steady focus: enhancing and automating information lifecycle management and data preservation processes for all categories and volumes of data. The company's website may be accessed at www.FileTek.com.

FileTek is headquartered at 9400 Key West Avenue, Rockville, MD 20850. Telephone: 301-251-0600. Fax: 301-251-1990. The FileTek international headquarters, FileTek Ltd, is located at One Northumberland Ave., London WC2N 5BW. Telephone: +44 (0) 207 872 5583. Fax: +44 (0) 207 753 2829. The company also has offices across North America.

About The National Institutes of Health

The National Institutes of Health (NIH), a part of the U.S. Department of Health and Human Services, is the primary Federal agency for conducting and supporting medical research. Helping to lead the way toward important medical discoveries that improve people's health and save lives, NIH scientists investigate ways to prevent disease as well as the causes, treatments, and even cures for common and rare diseases. Composed of 27 Institutes and Centers, the NIH provides leadership and financial support to researchers in every state and throughout the world. For over a century, the National Institutes of Health has played an important role in improving the health of the nation. The NIH traces its roots to 1887 with the creation of the Laboratory of Hygiene at the Marine Hospital in Staten Island, NY. Headquartered in Bethesda, Maryland, the NIH has more than 18,000 employees on the main campus and at satellite sites across the country.

With the support of the American people, the NIH annually invests over $28 billion in medical research. More than 83% of the NIH's funding is awarded through almost 50,000 competitive grants to more than 325,000 researchers at over 3,000 universities, medical schools, and other research institutions in every state and around the world. About 10% of the NIH's budget supports projects conducted by nearly 6,000 scientists in its own laboratories, most of which are on the NIH campus in Bethesda, Maryland.

About Cambridge Computer

Cambridge Computer provides professional services, sales, and education in the fields of storage networking, data protection, and digital preservation. Founded in 1991, Cambridge Computer is best known in the data storage industry for authoring best practices for enterprise backup and archiving, and for its role in defining use cases for next-generation storage networking technologies. Cambridge Computer's clients span all industries and range in size from small businesses to the Fortune 50. FileTek has partnered with Cambridge Computer to explore new markets, particularly in life sciences and higher education.

©2010 FileTek. All rights reserved. FileTek, StorHouse, and Trusted Edge are U.S. registered trademarks of FileTek, Inc. Other trademarks presented herein are the property of their respective owners. The following U.S. patents protect StorHouse: 4,864,572; 5,247,660; 5,727,197; 6,049,804.


Today’s requirement to digitally preserve and access massive amounts of information is an enormous operational and fiscal challenge for many organizations. Directly impacted by this challenge are the biologists, chemists, and medical researchers working in the fields of bioinformatics and biotechnology. Their life-saving work in the study of genomics can easily generate massive amounts of information measured in petabytes.

Diamond Lauffin, co-founder of Nexsan Technologies North America and a leading authority in the field of storage and storage systems, will discuss the challenges, requirements, and available technologies that are being implemented to successfully manage, archive, back up, and access this critical biomedical research information. The presentation will take place at "The Petabyte Challenge," a conference focused on addressing the challenges associated with managing the explosion of research-related data. The conference is being presented by the Association of Independent Research Institutes (AIRI) and will be held on February 18th and 19th at the Salk Institute for Biological Studies in La Jolla, California.

Mr. Lauffin was directly involved in providing multi-petabyte solutions to organizations such as Fermi Labs and JPL/CalTech, as well as many other university, scientific, and commercial installations. Mr. Lauffin will share his personal experience with the recent evaluation and roll-out of a multi-petabyte 'active archive' solution at a major national laboratory. The deployed solution serves as the core storage component for the capture and sharing of next-generation genomic sequencing information produced at numerous bioinformatic research facilities and centers throughout the United States. Furthermore, he will discuss the laboratory's selection process and subsequent deployment of the FileTek StorHouse® storage virtualization and data management system as an active archive, backup, and redundant online, off-site secondary-site solution for its 12 petabytes of research data.

Mr. Lauffin will illustrate how an affordable, cost-justifiable approach was deployed to support a combined total of 24 petabytes of synchronized, on-site and off-site, online managed data. While the example discussed in this session may appear unique, these same storage challenges and requirements confront industries and organizations of all sizes. Please join Mr. Lauffin for this informative and entertaining session.