How to improve performance and decrease power in neural networks.
Executing a neural network on an NPU requires an understanding of application requirements, such as latency and throughput, as well as potential partitioning challenges. Sharad Chole, chief scientist and co-founder of Expedera, talks about fine-grained dependencies, why processing packets out of order can help optimize performance and power, and when to use voltage and frequency scaling versus clock gating.
[embedded content]
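The DVFS-versus-clock-gating question raised in the talk can be illustrated with the textbook CMOS dynamic-power model, P ≈ C·V²·f. The sketch below is a toy calculation with made-up numbers, not data from the interview or any real NPU; it only shows why the V² term makes voltage/frequency scaling more attractive than clock gating when a workload can tolerate running slower.

```python
# Toy comparison of two power-management strategies.
# Dynamic power scales roughly as P ~ C * V^2 * f; all numbers here are
# illustrative assumptions, not measurements from any real NPU.

def dynamic_power(c_eff, voltage, freq):
    """Classic CMOS dynamic-power approximation: P = C * V^2 * f."""
    return c_eff * voltage**2 * freq

# Hypothetical baseline operating point (normalized units).
C, V, F = 1.0, 1.0, 1.0
base = dynamic_power(C, V, F)

# Clock gating: the clock is stopped during idle cycles, so average
# dynamic power drops in proportion to the idle fraction, but the
# voltage stays at the full level while running.
idle_fraction = 0.5
gated = base * (1 - idle_fraction)

# DVFS: running at half frequency and proportionally lower voltage
# fills the same time budget with useful work, and the V^2 term cuts
# power much more than gating does.
dvfs = dynamic_power(C, V * 0.5, F * 0.5)

print(f"baseline power:           {base:.3f}")
print(f"clock-gated (50% idle):   {gated:.3f}")
print(f"DVFS at half V, half f:   {dvfs:.3f}")
```

Under these assumed numbers, gating a half-idle clock saves 50%, while halving both voltage and frequency saves 87.5%; the flip side, as the talk notes, is that DVFS only wins when latency requirements allow the slower clock.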
Ed Sperling
Ed Sperling is the editor in chief of Semiconductor Engineering.
Source: https://semiengineering.com/application-optimized-processors/