WP3: Prototype implementation at CNAF. Status.


1 WP3: Prototype implementation at CNAF. Status

2 WP3 objectives
- SoC cluster, low-power, no low-latency network
- Technology tracking
- Cluster design, installation, and management
- Testing, tuning, and benchmarking
- Installation, configuration, and maintenance of the software tools required by the other WPs to study performance and to use the cluster (compilers, libraries, development frameworks, etc.)
- Checking with the other WPs for alternatives to software and compilers/libraries that are missing (or reworked) on the new SoC architectures
- Study and implementation of software to monitor the project's main metrics of interest
- WP3 depends on WP2 until PM9 for the decision on the SoC platform on which to base the CNAF cluster

3 CNAF cluster status, pre-COSA

4 CNAF cluster status (COSA "certified")
#    Model          SoC                    ISA      Cores                    TDP (W)
1    Supermicro     Intel C2750            x86-64   8x Avoton                20
6    Jetson-k1-01   Nvidia K1              ARMv7    4x A15                   15
1    Odroid-XU3     Samsung Exynos 5422    ARMv7    4x A15, 4x A7            5
1    Cubieboard     Allwinner A80          ARMv7    4x A15, 4x A7            5
1    Odroid-XU      Samsung Exynos 5410    ARMv7    4x A15 (4x A7)           5
1    Arndale        Samsung Exynos 5420    ARMv7    4x A15 (4x A7)           5
1    SABREboard     Freescale i.MX6        ARMv7    4x A9                    5
12   ARMv7 Cards                           ARMv7    20x A15, 4x A9, 8x A7    40

5 Cluster network
[Diagram: MASTER (fanless x86-64) between two networks. eth1: ARM nodes, 10.0.0.0/24. eth0: CNAF, 172.16.11.0/23, reached via ssh.]
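As a rough illustration of the two-network layout on this slide, the sketch below classifies an address as cluster-side or CNAF-side with Python's standard `ipaddress` module. The two prefixes are the ones shown on the slide; the function name `side_of` and the sample addresses in the usage note are hypothetical. The prefix `172.16.11.0/23` has host bits set, so it is parsed with `strict=False`, which normalizes it to `172.16.10.0/23`.

```python
import ipaddress

# Prefixes as written on the slide.
CLUSTER_NET = ipaddress.ip_network("10.0.0.0/24")
# strict=False: the slide's prefix has host bits set; it normalizes to 172.16.10.0/23.
CNAF_NET = ipaddress.ip_network("172.16.11.0/23", strict=False)


def side_of(address: str) -> str:
    """Return which side of the master an address belongs to (hypothetical helper)."""
    ip = ipaddress.ip_address(address)
    if ip in CLUSTER_NET:
        return "cluster"
    if ip in CNAF_NET:
        return "cnaf"
    return "external"
```

For example, `side_of("10.0.0.17")` returns `"cluster"`, while an address outside both prefixes returns `"external"`.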

6 Cluster services (provided by the master)
Firewall   Allow rules: SSH/Apache for everyone; DHCP/BOOTP/DNS/TFTP/LDAP for cluster nodes
NAT        A single external IP
DHCP       For cluster nodes (a dedicated port)
TFTP       For BOOTP/PXE installation
NFS        For cluster nodes
LDAP       For cluster users
(SLURM)    For cluster nodes
- Software is currently installed on bare metal (x86 hardware)
- Next step: software packed in a VM or a Docker container

7 COSA power network
[Diagram labels: 230V AC input, power probe, LabVIEW, DC probe, 12V and 5V lines to the cards.]
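The slide does not say how the probe readings are processed. As a minimal sketch, assuming the DC probe delivers timestamped voltage and current samples for one rail, power and energy could be derived as below. The function names and the sample format are assumptions, not part of the original setup.

```python
from typing import Sequence


def power_w(voltage_v: float, current_a: float) -> float:
    """Instantaneous DC power in watts: P = V * I."""
    return voltage_v * current_a


def energy_j(times_s: Sequence[float], powers_w: Sequence[float]) -> float:
    """Integrate power over time with the trapezoidal rule; returns joules."""
    if len(times_s) != len(powers_w):
        raise ValueError("times and powers must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (powers_w[i] + powers_w[i - 1]) * dt
    return total
```

A card drawing a constant 10 W for 2 s would, under this sketch, consume 20 J.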

8 PSU & cables
- PSU HX1000i
- 12 x 12V lines (Jetson)
- 6 x 5V lines
- GRIDSEED cables
- From 1 Molex to 6 barrel connectors


10 Measurement and lab power equipment
- Power supply
- Power analyzer
- Digital multimeter

11 Tuning a true heterogeneous environment
                  | AllWinner A80 | Nvidia Tegra K1 | Samsung Exynos 5422
CPU               | 4x A15 + 4x A7 | 4x A15 | 4x A15 + 4x A7
L1 cache          | 32KB/32KB
L2 cache          | 2MB + 512KB
GPU               | PowerVR G6230 (64 cores) | Kepler GK20a (192 cores) | ARM Mali-T628 MP6
GPU API           | OpenGL ES 3.0, OpenCL 1.x, DirectX 9.3 | OpenGL ES 3.1, OpenGL 4.4, OpenCL 1.2, CUDA 6.0, DirectX 12 | OpenGL ES 3.0, OpenCL 1.1, DirectX 11
Decoder           | 1080p30: H.265/VP9 | 1440p30: H.264/VP8 | 1080@120fps: H.264/VP8
Encoder           | 4K@30fps: H.264 and VP8 | 1440p: H.264/VP8 | 1080@120fps: H.264/VP8
Memory interfaces | DDR3/DDR3L/LPDDR3 (8GB), raw NAND 72-bit ECC, eMMC v4.5 | DDR3L, LPDDR3 (8GB), eMMC 4.5 | LPDDR3/DDR3, eMMC 5.0

12
              | AllWinner A80 | Nvidia Tegra K1 | Samsung Exynos 5422
USB           | 2x USB host, 1x USB3.0/2.0 host/device, HSIC | 2x USB 3.0, 3x USB 2.0, HSIC | 2x USB 3.0, 1x USB 2.0, 1x HSIC
Ethernet      | 1x Ethernet MAC | N/A | [missing]
TS interface  | No data | 1x TS | [missing]
SATA          | N/A | SATA 3.1 | N/A
PCIe          | N/A | 5-lane PCIe with Gen 1 (2.5 GT/s) and Gen 2 (5.0 GT/s) speeds | N/A
Audio I/F     | PCM/I2S | PCM/I2S, S/PDIF | 1x PCM, 2x I2S, 1x S/PDIF
Other I/Os    | 4x SPI, 7x TWI, 7x UART | 3x I2C, 2x SPI, UART, up to 64 MPIO (Multi Purpose IO) | 4x I2C, 7x HS-I2C, 3x SPI, 5x UART, GPIOs, 24-channel DMA controller

13 Tuning (1/4): core frequencies
Board        Cores    Max freq
Jetson       4x A15   2.3 GHz
Odroid-XU3   4x A15   2.0 GHz
             4x A7    1.4 GHz
Cubieboard   4x A15   1.6 GHz
             4x A7    1.2 GHz
- cpufreq utils for ARM/Intel cores
- nvi utils for Nvidia MP cores
- Online/offline cores
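The frequency capping and core online/offline steps can also be driven through the Linux sysfs CPU interface, which is what the cpufreq tools use underneath. The sketch below is not the project's tooling: the helper names are hypothetical, writing requires root, and which files exist depends on the kernel and board. The `root` parameter exists only so the helpers can be pointed at a test directory.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def read_max_freq_khz(cpu: int, root: Path = SYSFS_CPU) -> int:
    """Read the current upper frequency limit of a core, in kHz."""
    path = root / f"cpu{cpu}" / "cpufreq" / "scaling_max_freq"
    return int(path.read_text().strip())


def set_max_freq_khz(cpu: int, khz: int, root: Path = SYSFS_CPU) -> None:
    """Cap a core's frequency (kHz); needs root on a real system."""
    path = root / f"cpu{cpu}" / "cpufreq" / "scaling_max_freq"
    path.write_text(str(khz))


def set_online(cpu: int, online: bool, root: Path = SYSFS_CPU) -> None:
    """Bring a core online or take it offline; needs root on a real system."""
    (root / f"cpu{cpu}" / "online").write_text("1" if online else "0")


def is_online(cpu: int, root: Path = SYSFS_CPU) -> bool:
    """Report whether a core is online."""
    path = root / f"cpu{cpu}" / "online"
    # Some cores (often cpu0) expose no 'online' file because they cannot be unplugged.
    if not path.exists():
        return True
    return path.read_text().strip() == "1"
```

On a big.LITTLE board, taking the four A7 cores offline with `set_online` would be one way to benchmark the A15 cluster in isolation, assuming the kernel exposes them as separate CPUs.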

14 Tuning (2/4): memory bandwidth
- Single- and dual-channel LPDDRx, DDRx
- GPU and CPU share the same memory (no data transfers between separate memories, unlike traditional CPU/GPU architectures)
- Benchmarks: STREAM and others
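To make the measurement concrete, here is a minimal sketch in the spirit of STREAM's Copy kernel, using only the Python standard library. It is not the STREAM benchmark itself: the function name, buffer size, and repeat count are assumptions, and results are only indicative because the buffer must be much larger than the last-level cache for the figure to reflect DRAM bandwidth.

```python
import time


def copy_bandwidth_gbs(n_bytes: int = 64 * 1024 * 1024, repeats: int = 5) -> float:
    """Estimate memory copy bandwidth in GB/s from a bulk buffer copy."""
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # same-size slice assignment: one bulk memory copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-9)  # guard against a zero reading from a coarse timer
    # As in STREAM Copy, count both the bytes read and the bytes written.
    return 2 * n_bytes / best / 1e9
```

Taking the best of several repeats, as STREAM does, reduces the influence of page faults on the first pass.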

15 Tuning (3/4): storage speed
- SD
- eMMC 5.0 (supposed to be the fastest)
- SSD (if a SATA interface is present)
- NFS
- Benchmarks: IOzone, Bonnie++ and others
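As a rough stand-in for what IOzone or Bonnie++ report, the sketch below times a sequential write (with `fsync`) and a sequential read of a temporary file in a given directory. The function name and default sizes are assumptions. The read figure is likely inflated by the page cache unless the file is larger than RAM or caches are dropped first, so the dedicated tools remain the better choice for real comparisons.

```python
import os
import tempfile
import time


def sequential_throughput_mbs(directory: str, size_mb: int = 64, block_kb: int = 1024):
    """Return (write_MB_per_s, read_MB_per_s) for a temporary file in `directory`."""
    block = os.urandom(block_kb * 1024)
    n_blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # make sure data reached the device
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_kb * 1024):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    total_mb = n_blocks * block_kb / 1024
    return total_mb / write_s, total_mb / read_s
```

Pointing `directory` at a mount on SD, eMMC, SSD, or NFS in turn gives a first-order comparison of the storage options listed above.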

16 Tuning (4/4): network bandwidth and latency
Board        PHY Ethernet link
Jetson       PCI-E
Odroid-XU3   USB-ETH0 bridge
Cubieboard   Directly to the MAC link of the SoC
- Benchmarks: iperf and others
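The idea behind a bandwidth test such as iperf can be sketched with plain TCP sockets: one side sends a known number of bytes, the other counts what arrives, and throughput is bytes over elapsed time. The code below is an illustration only, with hypothetical names, and runs both ends in one process over loopback, so it says nothing about the physical links listed above. A real measurement would place the receiver on another node and use iperf.

```python
import socket
import threading
import time


def _sink(server_sock: socket.socket, result: list) -> None:
    """Accept one connection and count the bytes received until it closes."""
    conn, _ = server_sock.accept()
    received = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            received += len(chunk)
    result.append(received)


def tcp_throughput_mbit(host: str = "127.0.0.1",
                        total_bytes: int = 32 * 1024 * 1024,
                        chunk_size: int = 65536) -> float:
    """Send at least `total_bytes` over TCP and return the throughput in Mbit/s."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result: list = []
    receiver = threading.Thread(target=_sink, args=(server, result), daemon=True)
    receiver.start()

    payload = b"\0" * chunk_size
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            client.sendall(payload)
            sent += chunk_size
    receiver.join()  # wait until the receiver has drained the stream
    elapsed = time.perf_counter() - start
    server.close()
    return result[0] * 8 / elapsed / 1e6
```

Latency would need a separate ping-pong style test with small messages, which this sketch does not cover.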

17 The SoC market
- After a two-year period (2013/2014) rich in new products, the market for boards based on ARM SoCs has slowed visibly (probably because of the ARMv7 -> ARMv8 transition)
- Tegra X1 only announced
- Xeon-D only announced
- Good news: ARMv8 SoCs are already a reality in the mobile world
- Exynos 7420, Snapdragon 810, Mediatek Helio X10/X20, Kirin 930
- A57/A72 and A53
- Up to 10 cores (X20)

18 Next steps
- Buy new hardware
- Input from other WPs
- Consolidate the cluster
- Single testing/benchmark framework
- Single GUI
- User-friendly access
- Automatic installation (Puppet/Foreman, etc.)
- Continuous build integration
- GitHub repository
- Improve knowledge of the bootloader (U-Boot)
- Install GPU tools other than CUDA (OpenMP4, OpenCL, C++ AMP, Cilk Plus, etc.)
- Test new CPUs/GPUs in an Android environment if Linux is not available (find differences between glibc and Bionic)

