Aurora hpc solutions value
Uploaded by eurotechhpc
Aurora solutions
Value creation
AURORA
Aurora solution benefits
Aurora supercomputers from Eurotech
• Aurora is the name of Eurotech's liquid-cooled supercomputers, which excel in:
  - Computing power
  - Density
  - Energy efficiency
  - Reliability
  - Availability
  - Compatibility
  - Cost effectiveness
Why Aurora solutions?
• Scalability
  - Linear scalability for users, from Gigaflops to Exaflops
• Compatibility and flexibility
  - Choice of interconnects
  - x86-based systems
• Green
  - Aurora data centers consume 50% less power than data centers based on standard air-cooled technology
• Competence
  - HPC division expertise
  - End-to-end solution deployment
  - Flexibility to work with the final customer, with system integrators (SI) and as an OEM
• Reliability
  - High quality
  - No moving parts and reduced hot spots
• High performance and high density
  - Fastest available technology for high-density computational power
High Performance and High Density
• Aurora uses the fastest Intel technology available
  - Intel Xeon E5 (Sandy Bridge) on the latest Aurora HPC 10-10
  - Optional GPU accelerators as part of the solution
• Fast InfiniBand interconnects
  - Water-cooled InfiniBand switches included in the systems (one in every chassis)
• 3D torus
  - High-speed (up to 60 GB/sec) 3D torus network based on FPGAs
  - In collaboration with research institutions like I.N.F.N, TNW and FBK, Eurotech has developed one of the fastest and most reliable 3D torus networks on the market
• FPGA accelerators
  - Aurora nodes have an on-board FPGA that can be programmed as an accelerator
• High density
  - Aurora systems can pack 2 PetaFlops in just 30 m², the size of a studio flat
Scalability
• Hot-pluggable modular system
  - All components of the Aurora system are hot pluggable. Aurora can be scaled from a single chassis with 16 nodes to multiple chassis in a rack and on to a system with multiple racks
• 3D torus network
  - Nearest-neighbor network with no switches and no bottlenecks facilitates scalability
• InfiniBand
  - Fast interconnections allow low-latency communication
• Synchronization network
  - Very fast channel and global commands with subdomain manageability
  - Low/high level synchronization
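To illustrate what a nearest-neighbor 3D torus with no switches means in practice, here is a minimal Python sketch. It is not Eurotech code: the function names and the 4 x 4 x 4 grid are hypothetical, and the only assumption is the textbook definition of a torus, where every node links to its two neighbors along each axis and the links wrap around at the edges.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six direct neighbors of `node` in a 3D torus of size `dims`."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            # The modulo implements the wrap-around link at the grid edge.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum number of link hops between nodes `a` and `b`."""
    # Along each axis, take the shorter of the direct and the wrap-around path.
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical grid size, chosen only for illustration
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

Because every node has the same six links regardless of system size, growing the grid adds nodes without adding a central switch that all traffic must cross.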
Reliability
• Aurora has no moving parts
  - There are no spinning discs and no fans for cooling on-board heat-generating components
  - As a result, there are no vibrations that can harm memory and rotating discs, which increases the longevity of the components and, as a consequence, of the whole system
• Aurora cooling limits hot spots
  - Improved cooling infrastructure and direct-to-component heat removal allow uniform heat production and removal, limiting hot spots and hence another cause of failures
• Monitoring and resilience
  - Independent sensor networks
  - Redundancy of all components, including networks
  - Choice of proactive support (preventive maintenance)
• Eurotech commitment to quality
  - Eurotech produces its boards in its Japanese plant, following high quality standards
  - Eurotech HPC selects best-in-class suppliers to set up the Aurora solutions
Flexibility and compatibility
• Choice of interconnects
  - Aurora computational units offer both an InfiniBand network and a 3D torus interconnection
  - The customer can choose the technology that best fits the nature of the computational problem to be solved
• x86-based solution
  - Aurora supercomputers are based on x86 processors
  - Intel Cluster Ready certified
• A choice of software
  - Aurora can run a vast variety of software, both open source and commercial
• A flexible solution approach
  - Eurotech HPC can design solutions involving accelerators, storage and software, according to customer requirements
Green and environmental
Everyone nowadays claims to be «green», but are they?
Aurora Green proposition:
• Energy efficiency (achievable datacenter PUE of 1.05)
• Direct on-component water cooling
• High reliability means fewer spare parts and hence less waste
• 230 V AC to 10 V DC in 2 steps, for a power conversion efficiency between 93% and 97%
• Free cooling (heat exchangers rather than chillers and AHUs)
• Thermal energy recovery
• High density for floor space savings
• Noiseless operation means better work environments
Competence
• 10+ years of HPC experience
  - Extensive experience in top HPC projects
  - Collaboration with best-in-class research centers to develop advanced supercomputer prototypes
  - Experience in delivering large systems (over $15M)
• Structure and agility
  - The Eurotech HPC division benefits from the structure of the Eurotech group, while keeping the typical agility of a start-up
  - This means prompt adaptability to customer needs. Eurotech can deliver end-to-end HPC solutions, inclusive of supercomputers, storage, software and services. It can also work with system integrators, delivering parts of larger systems, and as an OEM for larger vendors
• Financial solidity
  - Unlike the pure HPC players in the market, Eurotech's income statement relies on multiple lines of business, which helps the HPC division smooth revenues and rely on abundant resources
AURORA
Aurora total cost of ownership
Data center TCO drivers
Driver: Cost components
• IT CAPEX: initial SW and HW capital expenditures
• Space occupancy (footprint): cost of the occupied space and auxiliary infrastructure (rent, opportunity cost, civil, structural and engineering, permits and taxes)
• Data center infrastructure: electrical (UPS, generator, cables…) and cooling (chillers, AHUs, heat exchangers, pumps…)
• Installation: delivery costs, installation, test and tuning of IT, electrical and cooling equipment
• Energy: cost of energy for IT, cooling, lighting and waste
• Maintenance and additional operation costs: warranty extensions, support, software licenses, IT maintenance, electrical and cooling maintenance, facilities maintenance, costs of outages, heating, security
• Other (disposal, green): costs of end of life, carbon footprint, (missed) incentives, fines…
Main Areas of Aurora systems impact on TCO
• Area 1: energy savings
  - Lower costs due to energy cost optimization and better PUE
• Area 2: density (FLOPS/m²)
  - Aurora density allows for savings on space, racks, electrical, cooling and network
• Area 3: reliability
  - Aurora reliability contributes to lower maintenance costs and lower business costs of outages
• Area 4: liquid cooling
  - Apart from the energy savings that water cooling implies, it also saves on the capital costs of cooling infrastructure
TCO - energy: “Typical” power breakdown in datacenters
[Figure: typical datacenter power breakdown. Data and image from APC]
TCO - energy: Power breakdown in an Aurora datacenter leading to a PUE of 1.05
TCO - energy: Energy savings
• Example of energy savings compared to 2 alternative air-cooled solutions
• Rationale: less energy spent in cooling and less energy wasted in power conversion
Total energy savings in 5 years compared to alternative solutions:
• Compared to an air-cooled data centre based on 1U servers (PUE = 2.13): € 2,820,000
• Compared to an air-cooled data centre based on blade servers (PUE = 1.6): € 1,200,000
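The link between PUE and the energy bill can be sketched in a few lines of Python. PUE is total facility energy divided by IT energy, so for a fixed IT load the facility energy is simply the IT load times the PUE. The IT load (500 kW) and the electricity price (0.10 € per kWh) below are hypothetical values chosen for illustration; the slides do not state the inputs behind their own savings figures, so this sketch does not attempt to reproduce them. It also assumes the IT load is identical in all cases, which ignores the power-conversion gains mentioned in the rationale.

```python
HOURS_PER_YEAR = 8760


def facility_energy_cost(it_load_kw: float, pue: float,
                         price_per_kwh: float, years: float) -> float:
    """Energy cost of the whole facility: IT energy scaled by the PUE."""
    it_energy_kwh = it_load_kw * HOURS_PER_YEAR * years
    return it_energy_kwh * pue * price_per_kwh


def energy_savings(it_load_kw: float, pue_baseline: float, pue_new: float,
                   price_per_kwh: float, years: float) -> float:
    """Cost difference between two facilities running the same IT load."""
    baseline = facility_energy_cost(it_load_kw, pue_baseline, price_per_kwh, years)
    new = facility_energy_cost(it_load_kw, pue_new, price_per_kwh, years)
    return baseline - new


if __name__ == "__main__":
    IT_LOAD_KW = 500.0      # hypothetical IT load
    PRICE_PER_KWH = 0.10    # hypothetical electricity price in EUR
    for label, pue in (("1U servers", 2.13), ("blade servers", 1.6)):
        saved = energy_savings(IT_LOAD_KW, pue, 1.05, PRICE_PER_KWH, years=5)
        print(f"vs air-cooled {label} (PUE {pue}): {saved:,.0f} EUR over 5 years")
```

The savings scale linearly with the PUE difference, the IT load and the energy price, which is why the gap to the PUE 2.13 case is roughly twice the gap to the PUE 1.6 case under these assumptions.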
TCO - density
Fewer Flops/m² means:
+ servers
+ maintenance costs
+ electrical
+ cooling
+ volume occupancy (m³)
+ civil, structural and engineering costs
+ energy costs
+ raised floor costs
+ space occupancy (m²)
+ IT hardware costs, like racks
+ network, electrical and cooling hardware costs
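As a rough illustration of how density drives floor space, the sketch below converts a target performance into a footprint. The Aurora density comes from the figure quoted earlier in the deck (2 PetaFlops in 30 m²) and the 500 Tflops target is the system size used in the TCO examples. The air-cooled density of 10 Tflops/m² is a purely hypothetical comparison value, not a figure from the slides.

```python
def floor_area_m2(target_tflops: float, density_tflops_per_m2: float) -> float:
    """Floor area needed to host `target_tflops` at a given compute density."""
    if density_tflops_per_m2 <= 0:
        raise ValueError("density must be positive")
    return target_tflops / density_tflops_per_m2


# 2 PetaFlops (2000 Tflops) in 30 m2, as quoted in the deck.
AURORA_DENSITY = 2000.0 / 30.0
# Hypothetical air-cooled density, for comparison only.
AIR_COOLED_DENSITY = 10.0

if __name__ == "__main__":
    target = 500.0  # Tflops
    print(f"Aurora:     {floor_area_m2(target, AURORA_DENSITY):.1f} m2")
    print(f"Air cooled: {floor_area_m2(target, AIR_COOLED_DENSITY):.1f} m2")
```

Every item in the list above (racks, raised floor, cabling, cooling hardware) grows with this footprint, which is why density shows up in several TCO drivers at once.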
TCO - reliability
Reliability impacts TCO in 2 ways:
- Direct costs, associated with spare parts, extended warranties and support personnel
- Indirect costs, related to the business cost associated with an outage
The direct costs depend on the number of components and their estimated FIT (failures in time) rate, as shown by the MTBF equation
$$\mathrm{MTBF} = \frac{1}{\sum_{i=1}^{N} \lambda_i}$$
where $\lambda_i$ is the failure rate of the single component $i$ and $N$ is the number of components. When all components share the same failure rate $\lambda$, this reduces to $\mathrm{MTBF} = 1/(N\lambda)$.
• N is related to density
• λ is related to quality, operating conditions, monitoring and preventive maintenance of components
The indirect impact depends on the organisation and could range from thousands to millions of € per hour of outage. So the impact of low reliability on the business could offset any saving achieved during the purchase and installation of IT solutions!
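The effect of component count on MTBF can be sketched as follows. The sketch assumes the usual series-system model, in which the system failure rate is the sum of the component failure rates, and uses the standard definition of FIT as failures per 10^9 device-hours. The component counts and the 100 FIT rate are hypothetical numbers chosen for illustration.

```python
from typing import Iterable

FIT_HOURS = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def mtbf_hours(fit_rates: Iterable[float]) -> float:
    """System MTBF in hours for components in series, given FIT rates."""
    total_fit = sum(fit_rates)
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    # Failure rates add up, so the MTBF is the inverse of their sum.
    return FIT_HOURS / total_fit


if __name__ == "__main__":
    # Hypothetical systems: same component quality, different component counts.
    small = mtbf_hours([100.0] * 1000)
    large = mtbf_hours([100.0] * 2000)
    print(f"1000 components: {small:,.0f} h")
    print(f"2000 components: {large:,.0f} h")
```

Doubling the number of components at the same FIT rate halves the MTBF, which is the sense in which N (and therefore density) feeds directly into maintenance costs.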
TCO - liquid cooling
By adopting liquid cooling technology, it is possible to avoid most of the air conditioning used to cool the IT equipment.
This brings savings from avoiding chillers, AHUs, CFD (computational fluid dynamics) studies, raised floors and air conditioning tuning.
Liquid cooling infrastructure is generally cheaper, relying on components like piping, pumps and free coolers.
If we take a 1 MW installation and assume an air cooling infrastructure cost of $3,000/kW, the total cooling cost would be $3M. The same 1 MW data center with liquid cooling would probably require roughly 10% of that expenditure.
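The arithmetic of the 1 MW example can be written out as a short Python sketch. The 3,000 $/kW figure and the 10% ratio are the ones quoted in the text; the function and constant names are illustrative only, and the 10% ratio is the deck's rough estimate rather than a measured value.

```python
def cooling_capex_usd(it_load_kw: float, cost_per_kw_usd: float) -> float:
    """Capital cost of cooling infrastructure sized for a given IT load."""
    return it_load_kw * cost_per_kw_usd


IT_LOAD_KW = 1000.0           # 1 MW installation
AIR_COST_PER_KW = 3000.0      # air cooling infrastructure cost quoted in the text
LIQUID_SHARE_OF_AIR = 0.10    # "roughly 10%" estimate quoted in the text

air_capex = cooling_capex_usd(IT_LOAD_KW, AIR_COST_PER_KW)
liquid_capex = air_capex * LIQUID_SHARE_OF_AIR

if __name__ == "__main__":
    print(f"Air cooling:    ${air_capex:,.0f}")
    print(f"Liquid cooling: ${liquid_capex:,.0f}")
    print(f"Difference:     ${air_capex - liquid_capex:,.0f}")
```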
TCO example: comparison among 3 systems (500 Tflops)*, different processors are used
The best air cooled competitive solution has to be 100% cheaper to match TCO!
The standard air cooled competitive solution has to be 75% cheaper to match TCO!
Item | Air cooled 1U servers | Air cooled blade server | Liquid cooled blade server
Processor | Xeon 5600 | Xeon 5600 | Xeon E5
Cost of energy | $2,720 | $1,560 | $510
Retuning and additional CFD | $17 | $6 | $0
Total outage cost | $500 | $390 | $160
Preventive maintenance | $150 | $150 | $150
Annual facility and infrastructure maintenance | $670 | $380 | $130
Lighting | $14 | $5 | $2
Annualized 3 years capital costs | $3,520 | $3,250 | $3,040
Annualized 10 years capital costs | $1,770 | $1,100 | $440
Annualized 15 years capital costs | $380 | $120 | $30
ANNUALIZED TCO (thousands of USD) | $9,741 | $6,961 | $4,462
*Calculations assume that the cost of hardware and software is the same in the 3 cases
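The annualized TCO row is the sum of the cost rows above it, which can be checked with a short Python sketch. The figures are copied from the table for the different-processors scenario (thousands of USD); the dictionary layout and names are illustrative only.

```python
# Cost rows from the table, in thousands of USD, in this order:
# energy, retuning/CFD, outage, preventive maintenance, facility maintenance,
# lighting, 3-year capital, 10-year capital, 15-year capital.
COSTS_DIFFERENT_CPUS = {
    "Air cooled 1U servers":      [2720, 17, 500, 150, 670, 14, 3520, 1770, 380],
    "Air cooled blade server":    [1560, 6, 390, 150, 380, 5, 3250, 1100, 120],
    "Liquid cooled blade server": [510, 0, 160, 150, 130, 2, 3040, 440, 30],
}


def annualized_tco(costs: dict) -> dict:
    """Sum the annual cost rows of each system."""
    return {system: sum(rows) for system, rows in costs.items()}


if __name__ == "__main__":
    totals = annualized_tco(COSTS_DIFFERENT_CPUS)
    reference = totals["Liquid cooled blade server"]
    for system, total in totals.items():
        # Relative saving of the liquid cooled system against each alternative.
        saving = (total - reference) / total
        print(f"{system}: {total} K USD (liquid cooled saves {saving:.0%})")
```

Energy and the 3-year capital costs dominate every column, so most of the gap between the systems comes from the energy row.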
TCO example: comparison among 3 systems (500 Tflops)*, same processors are used
The best air cooled competitive solution has to be 55% cheaper to match TCO!
The standard air cooled competitive solution has to be 40% cheaper to match TCO!
Item | Air cooled 1U servers | Air cooled blade server | Liquid cooled blade server
Processor | Xeon 5600 | Xeon 5600 | Xeon 5600
Cost of energy | $1,690 | $1,560 | $910
Retuning and additional CFD | $14 | $6 | $0
Total outage cost | $390 | $390 | $330
Preventive maintenance | $150 | $150 | $150
Annual facility and infrastructure maintenance | $450 | $390 | $240
Lighting | $11 | $5 | $2
Annualized 3 years capital costs | $3,390 | $3,270 | $3,220
Annualized 10 years capital costs | $1,100 | $1,100 | $820
Annualized 15 years capital costs | $300 | $130 | $40
ANNUALIZED TCO (thousands of USD) | $7,495 | $7,001 | $5,712
*Calculations assume that the cost of hardware and software is the same in the 3 cases
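The slides do not say how the "has to be X% cheaper" figures were derived. One possible reading, which is an assumption made here and not something the deck states, is that they express the discount a competitor would need on its annualized 3-year capital costs to close the TCO gap with the liquid cooled system. The Python sketch below computes that quantity from the same-processors table (thousands of USD); the function name is hypothetical.

```python
def breakeven_discount(tco_competitor: float, tco_reference: float,
                       capital_3y_competitor: float) -> float:
    """Fraction of the competitor's 3-year capital cost equal to the TCO gap.

    Assumed interpretation: the competitor matches the reference TCO if its
    annualized 3-year capital cost drops by this fraction.
    """
    gap = tco_competitor - tco_reference
    return gap / capital_3y_competitor


# Figures from the same-processors table, in thousands of USD.
LIQUID_TCO = 5712
AIR_1U = {"tco": 7495, "capital_3y": 3390}
AIR_BLADE = {"tco": 7001, "capital_3y": 3270}

if __name__ == "__main__":
    for label, system in (("1U servers", AIR_1U), ("blade server", AIR_BLADE)):
        discount = breakeven_discount(system["tco"], LIQUID_TCO, system["capital_3y"])
        print(f"Air cooled {label}: break-even discount {discount:.1%}")
```

Under this reading the discounts come out at roughly 0.53 for the 1U servers and 0.39 for the blade server. Since the deck does not document its own calculation, these numbers should not be expected to reproduce the quoted percentages exactly.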