Finding structure and meaning by similarity. The University of New Mexico; Margaret Werner-Washburne, George S. Davidson. The post-genomic challenge was to develop high-throughput technologies for measuring genome-scale mRNA expression levels. Analyses of these data rely on computers in an unprecedented way to...
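A concrete way to read "similarity" here: expression profiles are compared pairwise, and genes with correlated profiles are grouped together. A minimal sketch in Python; the toy expression matrix, gene labels, and values below are invented for illustration only:

```python
# Minimal sketch: pairwise similarity between gene expression profiles.
# The expression matrix and gene names are invented toy data.
import numpy as np

# rows = genes, columns = experimental conditions
expression = np.array([
    [2.1, 3.4, 5.0, 4.8],   # geneA
    [2.0, 3.5, 5.1, 4.7],   # geneB (co-expressed with geneA)
    [5.2, 1.1, 0.9, 4.0],   # geneC
])

# Pearson correlation between every pair of profiles;
# values near 1.0 mark genes with similar expression patterns
similarity = np.corrcoef(expression)
print(similarity.round(2))
```

Genes whose rows of this matrix are close can then be clustered, which is one common route from raw expression measurements to structure.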
Modern high-throughput experimental approaches can generate very large quantities of data, requiring efficient computational approaches to process them. For example, a recent pan-proteome analysis by Muller et al. [18] collected protein abundance measurements from 103 species, detecting a total of 904,134 ...
DNA methylation is a key epigenetic property that drives gene regulatory programs in development and disease. Current single-cell methods that produce high-quality methylomes are expensive and low throughput without the aid of extensive automation. We pr...
Examples of high-dimensional data include high-throughput genomics [6], economic and financial time series data [7], and EEG data (with hundreds of channels). Another example, and the one that we are primarily interested in, is fMRI data [8] with millions of voxels and hundreds of Regions of Interest (...
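With millions of voxels but only hundreds of observations, a standard first step is to project onto a few components. A minimal PCA-via-SVD sketch; the shapes are invented (200 scans × 10,000 voxels stands in for real fMRI dimensions):

```python
# Minimal sketch: reducing high-dimensional observations (e.g., voxels or
# EEG channels) to a few principal components. Shapes are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10_000))      # 200 scans x 10,000 voxels (toy data)
X -= X.mean(axis=0)                     # center each voxel

# SVD-based PCA: rows of Vt are principal directions in voxel space
U, S, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :10] * S[:10]             # each scan summarized by 10 components
print(scores.shape)                     # (200, 10)
```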
High-throughput measurement technologies produce data sets that have the potential to elucidate the biological impact of disease, drug treatment, and environmental agents on humans. The scientific community faces an ongoing challenge in the analysis of these rich data sources to more accurately characteri...
In this work, we exploit the fact that the CALPHAD framework offers a simple pathway to address this limitation of current high-throughput methods. While the integration of basic first-principles data with experimental data in the construction of CALPHAD thermodynamic databases is an established techni...
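To make the idea of mixing first-principles and experimental inputs concrete, here is a toy weighted least-squares fit of a single Redlich-Kister parameter L0 in G_ex(x) = x(1−x)·L0. The data points, weights, and one-parameter model are all invented for illustration and do not represent a real assessment:

```python
# Hedged sketch: combining first-principles (DFT) and experimental data in a
# CALPHAD-style parameter fit. All numbers below are invented toy values.
import numpy as np

# (mole fraction x, mixing enthalpy J/mol, weight)
dft_points = [(0.25, -1850.0, 1.0), (0.50, -2400.0, 1.0), (0.75, -1700.0, 1.0)]
exp_points = [(0.50, -2550.0, 3.0)]   # experiment weighted more heavily (assumed)

x, h, w = map(np.array, zip(*(dft_points + exp_points)))
basis = x * (1.0 - x)                 # model: h ≈ x*(1-x) * L0

# weighted least squares for the single parameter L0
L0 = np.sum(w * basis * h) / np.sum(w * basis**2)
print(f"fitted L0 = {L0:.0f} J/mol")
```

The point of the sketch is only the mechanism: both data sources enter one objective, with weights expressing relative confidence in each.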
We have presented a set of data structures and compression algorithms for high-throughput sequencing data. We have transformed the nucleotide sequences into location and mismatch information through a mapping procedure to a reference genome, then applied fixed codes to encode that location and mismatch...
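A minimal sketch of this reference-based scheme: each mapped read becomes an alignment position plus a list of (offset, base) mismatches, packed with fixed-width fields. The byte layout below is an invented illustration, not the paper's actual codes:

```python
# Hedged sketch of reference-based read encoding: store a mapped read as its
# alignment position plus (offset, base) mismatches, packed with fixed-width
# fields. The exact layout here is illustrative, not the published format.
import struct

REFERENCE = "ACGTACGTACGTACGTACGT"
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_read(read: str, position: int) -> bytes:
    """Encode a read as (position, mismatches vs. the reference)."""
    mismatches = [(i, base)
                  for i, base in enumerate(read)
                  if REFERENCE[position + i] != base]
    # fixed-width fields: 4-byte position, 1-byte mismatch count, then
    # 1-byte offset + 1-byte base code per mismatch
    out = struct.pack("<IB", position, len(mismatches))
    for offset, base in mismatches:
        out += struct.pack("<BB", offset, BASE_CODE[base])
    return out

def decode_read(blob: bytes, length: int) -> str:
    """Reconstruct the read from the reference plus stored mismatches."""
    position, n = struct.unpack_from("<IB", blob, 0)
    read = list(REFERENCE[position:position + length])
    for k in range(n):
        offset, code = struct.unpack_from("<BB", blob, 5 + 2 * k)
        read[offset] = "ACGT"[code]
    return "".join(read)

encoded = encode_read("ACGTTCGT", position=4)
assert decode_read(encoded, length=8) == "ACGTTCGT"
print(len(encoded), "bytes")  # 7 bytes: position + count + one mismatch
```

Reads that match the reference closely compress to little more than a position, which is what makes mapping-based compression effective.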
Log File Replication. The number of log files must be taken into consideration early in the process. Synchronous replication of large volumes of log files requires significant bandwidth. We needed a sustained throughput of 11 MB/sec to handle the log file traffic between datacenters at peak times in...
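As a back-of-envelope check on provisioning for that figure: only the 11 MB/sec comes from the text; the overhead factor and link capacity below are assumptions made for illustration:

```python
# Back-of-envelope bandwidth check for synchronous log replication.
# Only the 11 MB/sec figure is from the text; the rest is assumed.
peak_log_rate_mb_s = 11.0           # sustained log traffic at peak (from text)
replication_overhead = 1.2          # assumed protocol/retransmit overhead
link_capacity_mbit_s = 155.0        # e.g., an OC-3 class link (assumption)

required_mbit_s = peak_log_rate_mb_s * 8 * replication_overhead
print(f"required: {required_mbit_s:.0f} Mbit/s "
      f"({required_mbit_s / link_capacity_mbit_s:.0%} of the link)")
# required: 106 Mbit/s (68% of the link)
```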
SQL Server Integration Services is a high-throughput ETL system for managing high-performance data transfers. There are a large number of data transformation and cleansing capabilities available to the ETL designer as part of the Integration Services design 'toolbox'. ...
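SSIS data flows are wired up in the graphical designer rather than written as code; as a language-neutral illustration of what a typical cleansing-plus-lookup transform does (the column names, lookup table, and error-row routing below are all invented), a plain-Python sketch:

```python
# Illustrative sketch of a cleansing + lookup step, analogous to what the
# Integration Services 'toolbox' transforms do. This is plain Python, not the
# SSIS API; the columns and lookup table are invented.
import csv, io

country_lookup = {"US": "United States", "DE": "Germany"}  # assumed dimension

raw = io.StringIO("id,name,country\n1,  Ada ,US\n2,Linus,de\n3,,US\n")

cleaned, rejected = [], []
for row in csv.DictReader(raw):
    row["name"] = row["name"].strip()                  # trim whitespace
    code = row["country"].upper()                      # normalize case
    if not row["name"] or code not in country_lookup:  # cleanse + lookup
        rejected.append(row)                           # route to error output
        continue
    row["country"] = country_lookup[code]
    cleaned.append(row)

print(cleaned)   # rows ready to load
print(rejected)  # rows routed to an error path, as an SSIS error output would be
```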