The technique presented here achieves simultaneous optimization of schedule time and data path component cost within a structured data path architecture, using a genetic algorithm. The data path architecture has been designed to overcome the problem of random interconnections between data path components ...
Syed Zawad, Cheng Li, Zhewei Yao, Elton Zheng, Yuxiong He, Feng Yan. (2023) DySR: Adaptive Super-Resolution via Algorithm and System Co-design. ICLR:2023. Sheng Shen, Zhewei Yao, Chunyuan Li, Trevor Darrell, Kurt Keutzer, Yuxiong He. (2023) Scaling Vision-Language Models with Sparse Mixtu...
EfficientDNNs: A collection of recent methods on DNN compression and acceleration. There are mainly 5 kinds of methods for efficient DNNs: ...
When the number of rules exceeds a certain range, the algorithm's performance is reduced and its resource consumption increases compared with bit representation. Fong et al. [26] proposed the first 11-field classifier algorithm compatible with the OpenFlow table. This ...
The compiler examines the data flow graph of the programs and partitions it into clusters whenever it exceeds the queue limits of the target architecture. The presented algorithm deals with the two factors that affect the utilization of the queue, namely parallelism and the length of variables' ...
At the beginning of the Thanos project, we decided to reuse a very naive compaction algorithm for simplicity. We calculated that, in theory, we don’t need to make the compaction process parallel within a single block of data. Given a steady stream of 100 GB (or more) of eventually ...
(a) Subblock matrix multiplication scheme with timing information; (b) algorithm mapping and data sharing on the SmartCell system. Figure 8: Pipelined computations for one subblock result of matrix C. The data in red circles denote the external inputs in each time step.
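The subblock scheme named in the caption above can be illustrated with a generic blocked matrix multiplication. This is only a minimal sketch: the function name `blocked_matmul` and the `block` parameter are hypothetical, and it does not reproduce the SmartCell mapping, timing, or data sharing, none of which the caption specifies.

```python
def blocked_matmul(A, B, block=2):
    """Multiply A (n x m) by B (m x p) one subblock at a time.

    Illustrative only: the block size and loop order are assumptions,
    not the scheme used on the SmartCell system.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    # Walk over subblocks of C (i0, j0) and accumulate over k0.
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Partial product of one subblock of A with one of B.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C
```

Each subblock of C depends only on one row band of A and one column band of B, which is what makes it possible to assign subblocks to separate processing elements.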
Based on the extended directed graph and embedded OSEK/VDX OS model, the key processes of autoC for translating a given OSEK/VDX application into a sequential model are formalized in Algorithm 1. In the translation processes, autoC does not compute the values of variables; instead, it just exp...
Aeron Cluster provides support for fault-tolerant services as replicated state machines based on the Raft consensus algorithm. Performance is the key focus. A design goal for Aeron is to be the highest throughput with the lowest and most predictable latency of any messaging system. Aeron integrates ...
SirixDB shares unchanged data pages as well as records between revisions, depending on a chosen versioning algorithm during the initial bootstrapping of a resource. SirixDB aims to balance read and write performance in its default configuration. Concurrent: SirixDB contains very few locks and aims ...
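The idea of sharing unchanged pages between revisions can be sketched in a few lines. This is a simplified illustration, not SirixDB's API or any of its versioning algorithms: the `Resource` class and its methods are hypothetical, and each revision here keeps its own full page table whose entries point to shared, immutable page objects.

```python
class Resource:
    """Toy versioned store: each revision maps page ids to immutable pages.

    Hypothetical names throughout; not SirixDB's actual data structures.
    """

    def __init__(self, pages):
        # Revision 0 holds the initial page table.
        self.revisions = [dict(pages)]

    def commit(self, changes):
        """Create a new revision; only pages listed in `changes` are replaced."""
        new_table = dict(self.revisions[-1])  # copies references, not page data
        new_table.update(changes)
        self.revisions.append(new_table)
        return len(self.revisions) - 1

    def read(self, revision, page_id):
        return self.revisions[revision][page_id]


store = Resource({0: bytes([1, 2, 3]), 1: bytes([4, 5, 6])})
rev1 = store.commit({1: bytes([7, 8, 9])})
```

After the commit, page 0 is the same object in both revisions, while page 1 differs; older revisions stay readable because committed pages are never modified in place.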